Archive for the ‘Terracotta developer’ Category.

Weekly Summary

I spent last week entirely on one issue: CDV-244 (and its duplicate CDV-736, which someone raised recently on a forum). I got a lot of coding done, but ultimately did not fix the issue. This week I need to put that aside and do some performance testing, since my team owes marketing some comparative testing between versions 2.5 and 2.6 of Terracotta. So, it was another frustrating week of not actually finishing anything.

The issue itself is interesting. Serializing an object in the L1 which is not fully faulted into memory can fail if the serialization uses sun.misc.Unsafe.getObject(Object, long), which is a native method and which bypasses all of our bytecode instrumentation magic. Our solution, which is becoming somewhat of a pattern with native methods, is twofold: (1) use instrumentation to create a new wrapper method within Unsafe, called __tc_getObject, which does all of the necessary resolution of the Object arg if it is clustered, and (2) rewrite all instrumented code which calls the original native method so that it calls the wrapper instead. (This is similar to the way we tackled String.intern(), another native method.)
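To make the wrapper pattern concrete, here is a minimal sketch that does not touch Unsafe at all. ClusteredRef, Accessors, getObject and tcGetObject are hypothetical stand-ins I made up for illustration; the real wrapper is generated by instrumentation and its body is not shown in this post.

```java
// Hypothetical stand-in for a clustered object whose state is not yet faulted in.
class ClusteredRef {
    private Object value;      // stays null until resolve() runs
    private boolean resolved;

    // Simulates faulting the object's state into local memory.
    void resolve() {
        if (!resolved) {
            value = "faulted-in value";
            resolved = true;
        }
    }

    // Raw read of the field with no resolution hook.
    Object rawValue() {
        return value;
    }
}

class Accessors {
    // Stand-in for the native method: reads whatever is in memory right now.
    static Object getObject(ClusteredRef ref) {
        return ref.rawValue();
    }

    // Stand-in for the generated wrapper: resolve first, then delegate.
    static Object tcGetObject(ClusteredRef ref) {
        ref.resolve();
        return getObject(ref);
    }
}
```

A caller that goes straight to getObject sees the unresolved state, while a caller redirected to tcGetObject always sees the resolved value. That is why the second half of the fix, redirecting the call sites, matters as much as the wrapper itself.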

Our product manager Taylor wrote a nice simple TIM which reproduced the issue. I took that and modified it so that I could step through using Eclipse’s debugger. I also came up with one of our automated system tests which currently fails, and which I hope will prove the fix works, once the fix is done. At this point I’ve got a fix that I think should work, but the boot jar tool complains about the boot jar being invalid during startup. I’ve decompiled and tested the instrumented copy of Unsafe from within the boot jar and it is fine, so I’m not yet sure what the problem is.

Friday was the last day for my teammate Antonio, who has accepted a job with NASA. We are extremely sorry to see him go. He has been on the team for some time, he knows a lot, and has done tons of great coding. He is very smart, works hard and is a nice guy. Good luck in space, Antonio!

Weekly Summary – TC Spring

This week is already underway, but here’s what was up last week.

Mostly I worked on this issue: http://jira.terracotta.org/jira/browse/CDV-569. The gist is that Terracotta Spring has a bug when you are running as the root web app inside Tomcat – it doesn’t properly parse the application name, or in this case use the reserved ROOT app name. (When running TC Spring as the root web app, you would want to use ROOT, in all caps, as the application name within your tc-config.xml file.)

I’ve still got a lot to learn about Terracotta Spring integration. It’s particularly hard to debug, because clustering of Spring beans happens through aspects (using Aspectwerkz mixins).

Incidentally, here are three good links to information about Terracotta Spring support.

Weekly Summary

At work there can be more productive weeks, and there can be less productive weeks. Last week, for me, frustratingly, was a less productive week. It was a reminder that working on multiple things at once isn’t good if you don’t get any of them done.

Part of the reason is that we are simply in that post-code-freeze mode of fixing bugs, testing, doing spring cleaning, that sort of thing. By its very nature you tend to work on a lot of little things.

During the first part of the week, I tried to help debug a seg fault we were seeing in the VM when using the IBM JDK, but I got nowhere. One of my teammates was making some progress when it was decided that other issues were higher priority.

So I switched to bug fixing. The best progress I made all week was on this issue. I have the issue reproduced in a system test, but I’m stalled on a solution and am waiting to talk it over with my team. It’s an interesting question – how should we handle read-only style methods (like Calendar.get()) which, counterintuitively, modify the internal state of the instance? In theory you should be able to use a mutable but safely-published instance in multiple threads without any synchronization, if its state is never changed. But Terracotta throws a runtime exception if an attempt is made to change the state of a clustered object outside of a lock, even if that change is unexpected, as in the case of Calendar.get().
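The shape of the problem can be shown with a toy class. This is only a sketch under my own assumptions: LazyCalendar is hypothetical, and the Thread.holdsLock check merely imitates the idea of "no state change outside a lock"; it is not how Terracotta detects the write.

```java
// Hypothetical class whose getter looks read-only but writes a cached field.
class LazyCalendar {
    private final long millis;
    private int dayOfEpoch = -1;   // computed lazily on first read

    LazyCalendar(long millis) {
        this.millis = millis;
    }

    int getDayOfEpoch() {
        if (dayOfEpoch < 0) {
            // The hidden write: refuse it unless the caller holds this object's monitor.
            if (!Thread.holdsLock(this)) {
                throw new IllegalStateException("state change outside of a lock");
            }
            dayOfEpoch = (int) (millis / 86_400_000L);
        }
        return dayOfEpoch;
    }
}
```

The first unsynchronized call to getDayOfEpoch() fails even though the caller only wanted to read. Once one synchronized call has populated the cache, later unsynchronized reads succeed, which is exactly what makes this kind of failure surprising.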

I then switched to this issue, which has to do with Terracotta integration with Spring. I’ve got the problem reproduced in a unit test, but once again I’m not entirely sure what the solution should be, so this morning I’m looking to fire up Spring and Terracotta for real and play around with various patches.

I did do my first phone-screen for Terracotta this week. I talked to a QA Engineering candidate. (Terracotta is hiring).

Here’s hoping for a more productive week this week…

Revenge of Hello Terracotta

When my manager Alex first started at Terracotta, he blogged a simple Hello Terracotta example to demonstrate POJO clustering in action. When I first started last month, he walked me through this same example and went into quite a bit more depth, decompiling some code and showing me a glimpse of how client POJOs are instrumented by Terracotta and thus plug into our clustering framework. Here’s what we did.

First, if you haven’t yet, please go and run through Alex’s Hello Terracotta example. I’ll be starting right where he left off.

Now, as Alex pointed out to me, I need to make a slight correction to the example POJO code he is using. Here is the code:

The correction is the commented out bit. The compiler complains about that code, since the root field is final. And anyway, it is not necessary – if an L1 node starts up and Terracotta sees that the clustered root already exists, Terracotta will ignore the assignment in the source code (Item 2 above) and instead set that field to the pre-existing root. That is why, if you run this sample program twice in a row without bouncing the Terracotta server, you will see that the counter continues to increment from where it left off the first time. Terracotta clustered objects persist.

To continue, today when I run this example I am adding the special super-secret -Dtc.classloader.writeToDisk=true flag to the vm flags when I start up the dso process. My full command looks like this:

dso-java.sh -Dtc.config=config/tc-config.xml -Dtc.classloader.writeToDisk=true -classpath bin test.HelloTerracotta

This additional flag causes Terracotta to dump the modified classes under ~/adapted/. Under that folder you should find a test/ directory containing the instrumented HelloTerracotta.class file.

At this point you can examine the raw bytecode to see what we’ve added – Terracotta uses ASM to instrument bytecode on the fly in order to cluster arbitrary client code.

Or, you can decompile the file like I did using Jad. This likely won’t be able to decompile everything all the way, but it’s close enough that you can see what’s going on.

When I decompile HelloTerracotta using Jad, after cleaning up the go() method by hand, this is what I see:

First thing to notice, the instrumented version of the class now implements two interfaces, Manageable and TransparentAccess. These are both in the dso-l1-api project if you download the Terracotta source code. The api is our internal api for dealing with all clustered objects in the L1 nodes.

Next, notice that all direct reads of the root field have been replaced with calls to a new dynamically-generated Set __tc_getroot() method, and the one spot in the constructor where we were setting the root field directly is now replaced by a call to a new __tc_setroot(Set) method. And you can see that those getter/setter methods are doing the work of ensuring any pre-existing root is set on this object, or else creating the new clustered root.
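For readers without the decompiled listing in front of them, the root accessor idea can be approximated as follows. The method names __tc_getroot and __tc_setroot come from the decompiled class; everything else (RootRegistry, HelloSketch, the root name string, the method bodies) is a hypothetical stand-in for the real calls into the Terracotta runtime.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the cluster-wide root lookup; not Terracotta's API.
class RootRegistry {
    private static final Map<String, Set<String>> ROOTS = new ConcurrentHashMap<>();

    // Returns the root already registered under this name, or registers the candidate.
    static Set<String> lookupOrCreate(String name, Set<String> candidate) {
        Set<String> existing = ROOTS.putIfAbsent(name, candidate);
        return existing != null ? existing : candidate;
    }
}

class HelloSketch {
    private Set<String> root;

    HelloSketch() {
        // In the original source this was a direct assignment to the root field.
        __tc_setroot(new HashSet<>());
    }

    // The assignment is ignored if a root with this name already exists.
    void __tc_setroot(Set<String> candidate) {
        root = RootRegistry.lookupOrCreate("HelloSketch.root", candidate);
    }

    // All direct reads of the field are replaced by calls to this getter.
    Set<String> __tc_getroot() {
        return root;
    }
}
```

Two HelloSketch instances end up sharing one Set, which mirrors why a second run of the sample program picks up the counter where the first run left off.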

You can see that there are a lot of calls into ManagerUtil – this is a dynamically clustered object’s hook into the Terracotta runtime. ManagerUtil can be thought of as a facade of static methods, encapsulating a more interesting system which I won’t go into here.

The go() method didn’t decompile entirely correctly, so I’ve had to tweak it by hand into what you see here. What I find most interesting is that now, in addition to the synchronized block, there are calls (in a try-finally block) to ManagerUtil to acquire/release a clustered lock. Remember – since the root field is actually a clustered object, modifying it requires obtaining a cluster-wide exclusive write lock. It’s also interesting to note that a chunk of that tc-config.xml file, specifically the lock information including the method expression, is being passed to the monitorEnterWithContextInfo method.
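The lock handling has roughly the following shape. This is a sketch only: ClusterLockStub and its enter/exit methods are hypothetical stand-ins backed by a local ReentrantLock, the lock expression string is made up, and the exact ordering relative to the synchronized block in the real instrumented code is an assumption on my part.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the runtime's clustered lock calls; not ManagerUtil itself.
class ClusterLockStub {
    private static final ReentrantLock LOCK = new ReentrantLock();

    // lockContext imitates the config-derived lock information passed along.
    static void enter(Object target, String lockContext) {
        LOCK.lock();
    }

    static void exit(Object target) {
        LOCK.unlock();
    }

    static boolean heldByCurrentThread() {
        return LOCK.isHeldByCurrentThread();
    }
}

class Counter {
    private int count;

    int go() {
        synchronized (this) {
            // Added by instrumentation: take the cluster-wide write lock as well.
            ClusterLockStub.enter(this, "* Counter.go(..)");
            try {
                return ++count;
            } finally {
                // Always release, even if the body throws.
                ClusterLockStub.exit(this);
            }
        }
    }
}
```

The try-finally is the important part: the clustered lock must be released on every exit path, just as the JVM releases the local monitor when the synchronized block ends.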

In conclusion, this example illustrates the pattern for any POJO being clustered by Terracotta. Terracotta dynamically instruments a class to adhere to the dso api, an api for handling all of the distributed objects and locks within Terracotta. Of course that barely scratches the surface of all that Terracotta does, but at least you can see that conceptually what Terracotta is doing behind the scenes to cluster your code is really not all that complicated.

For further reading, the Terracotta website offers some very straightforward articles about how Terracotta works, and how Terracotta scales. Additionally, there are quite a few other Terracotta employees who blog, and their blogs are listed here.

Weekly Summary

Here’s what I’ve been working on this week at Terracotta.

Terracotta 2.6 stable 0 was released this last week. Personally, my contribution was to help implement the dynamic String compression feature. This week, emphasis shifted from achieving feature-freeze to testing.

We wanted to add some tests around serializing Terracotta-instrumented objects between instrumented and uninstrumented code, particularly our new compressed Strings. We needed to make sure that we hadn’t broken serialization with our bytecode manipulation – particularly for String, which is a special case (although this wasn’t apparent to me until I read the source code for java.io.ObjectInputStream#readString).

I mentioned that we have a system test framework, built on JUnit, which is able to start up and run multiple L1 nodes in a test fixture, complete with instrumented classes. My first attempt was to try to test serialization in-process. That is, for each Class I was interested in testing serialization for, I wanted to load two different Class instances (from two different ClassLoaders) and serialize a sample object back and forth using both Classes. The first ClassLoader would be the normal Terracotta ClassLoader, which loads the instrumented version of a Class (either from the Terracotta bootjar or using dynamic instrumentation – in this case I was just trying to test the bootjar classes). I wrote a second, custom (nondelegating) ClassLoader which would load a non-instrumented Class for the same classname – it did this by loading from the unmodified bootclasspath (without the Terracotta bootjar prepended to it). This was tricky, since (as my teammate Tim showed me) ClassLoader is hardcoded to not allow loading of any class beginning with “java.”, except from the bootstrap loader. However, galvanized with the spirit of reckless abandon that I’ve come to embrace since working here, I quickly circumvented this with reflection:

Don’t try this at home. Anyway, this actually worked – I was able to get a test running which loaded different Class objects for each Class I was interested in, including the JDK classes.

However, I was thwarted in my attempts to test Java Serialization in process. I haven’t taken the time to track down the exact cause, but no matter what I tried I either got java.lang.LinkageErrors or some other problem I can’t remember. I’m sure it’s something to do with the fact that my hacked ClassLoader doesn’t play well with serialization, somewhere deep in the bowels of ObjectInputStream or ObjectOutputStream.

As a compromise, I homed in on compressed Strings specifically. At Alex’s suggestion, I manually serialized a sample large String to a file, using a regular non-instrumented String. Then I wrote a test which took the same String, compressed it, and then serialized it, and I compared the two resulting byte[] arrays. Luckily, the test passed.
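The comparison itself needs nothing beyond the standard library. A minimal sketch, with SerializationCheck and its method names being my own hypothetical choices (the real test compared against bytes written to a file by an uninstrumented JVM, which a single-JVM sketch cannot reproduce):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class SerializationCheck {
    // Serializes any object to a byte[] using standard Java serialization.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // True when both Strings produce byte-for-byte identical serialized forms.
    static boolean sameSerializedForm(String expected, String actual) {
        return Arrays.equals(serialize(expected), serialize(actual));
    }
}
```

In the real test, expected would be the reference bytes from the uninstrumented String and actual would come from a String that had been through compression.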

I also added some more unit tests to our StringCompressionUtil class, which is the class that actually does the compression of the String for us. It has two responsibilities: (1) actually compressing the byte[] array which we get from the original String, and (2) encoding that compressed byte[] array as a char[] array which can be stuck back into the same String instance. In the course of adding more tests I did a little refactoring and was pleased to find that even with such low-level code I was able to eliminate some duplication and extract some methods which made it a little more clear what our algorithm was intended to do.
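To give a feel for the first of those responsibilities, here is a sketch of a byte[] compression round trip using java.util.zip. I am assuming Deflater and Inflater purely for illustration; this post does not say which algorithm StringCompressionUtil uses, and CompressionSketch is a hypothetical class.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class CompressionSketch {
    // Compresses the whole input with the default deflate settings.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflates into a buffer of the known uncompressed length.
    static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] result = new byte[originalLength];
        try {
            int offset = 0;
            while (offset < originalLength) {
                int n = inflater.inflate(result, offset, originalLength - offset);
                if (n == 0) {
                    throw new IllegalArgumentException("compressed data ended early");
                }
                offset += n;
            }
            return result;
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
    }
}
```

Passing the uncompressed length to decompress keeps the sketch short; it is a convenience of this example, not a statement about how our utility class is written.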

At the end of the week I was just beginning to look into some monkey failures. The monkey machines are our continuous integration machines – read all about them here at Hung’s blog. Basically we do continuous integration on a whole bunch of different OSes using different JDKs and versions, all the time. This particular failure was using the IBM JDK on a Linux machine.

San Francisco Trip

Last week we all traveled to Terracotta headquarters in San Francisco for our quarterly get-together. This was my first trip since I was hired. Alex and I flew out together on Wednesday morning.

Overall it was a fantastic trip. I got to finally meet in person most everyone else in the company, including the rest of my teammates, who I’ve so far only talked to on the phone or over IM. We all went skiing on Thursday. We ate some delicious food. And we even got a little work done.

Here are some highlights:

Wednesday – nonstop flight out was slightly delayed due to fog in San Fran that morning, but we got to the office by early afternoon. We introduced ourselves around – I met Abhishek (a teammate on the Transparency team), Nabib, Raghu, Himanshu, Kunal, Iyer, Dwayne, Jason, Tim M, Taylor, Jeff, Peter, and Christine.

Alex and I grabbed some lunch, then we had a code review of the new “lock leasing” feature which my teammate Antonio worked on. Very interesting stuff, I’d like to blog about it separately. Tim and Saravanan were in that code review also.

After work Alex and Jason and I went to a wine bar, then to a great Thai food restaurant, where we met Juris.

Thursday – we (all of Engineering) got up early and drove in three cars to Northstar in Lake Tahoe, arriving about 9:00. Madcap Hijinx ensued.

Specifically, many of us took lessons. I’ve been skiing since I was six, but I decided to give snowboarding a try. I had previously snowboarded for about 30 minutes and it didn’t go so well, so I decided to take a lesson. Antonio and Himanshu were also in my class. Meanwhile Alex, Nabib, Jason and Abhishek took skiing lessons right next to us. All of us first timers did surprisingly well. The only injury was a nosebleed – Nabib’s face hit someone’s elbow after someone else ran into him.

The rest of the guys made sure to come heckle us near the end of our lessons. It was during this time that I got to witness EY on skis. I’d characterize his style of skiing as: gutsy, fearless and fast. He is very good at skiing. But, he is not so good at stopping. From what I could tell he usually stopped himself by throwing himself to the ground, usually losing one or both skis in the process. One fearless teammate who was brave enough to get in EY’s path in an attempt to snap some action photos had to throw himself clear as EY barreled past the lift line at the bottom of the hill. This gave rise to a number of jokes such as “stopping is for wimps!” and something about EY skiing all the way back to San Francisco.

I met Geert that morning, who was in from Belgium along with his girlfriend Nathalie. He didn’t want to ski due to his back.

At 1:00 we all met for lunch and spent the hour wondering if Manoj was still sliding face first down the mountain. Apparently they took him onto a steep blue, assuring him “aww, blues are basically just like bunny slopes”. Manoj thought it more prudent to slide on his belly than attempt to snowboard – I don’t blame him, he hasn’t been boarding for very long. Unfortunately for him, his teammates took some incriminating photos (which will hopefully be available soon).

That afternoon I left the bunny slopes and attempted the green slopes with everyone else. I was quickly left in the dust, but I was able to get to the bottom in one piece, and without falling too many times. Taking a lesson was definitely helpful – snowboarding is so different from skiing. I was a beginner all over again, having to think about each turn I was about to make. Raghu and I closed out the afternoon by taking some runs together, including my first blue slope. By then I was exhausted, and it was time to go.

That night I walked with Manoj, Raghu and Himanshu to their favorite nearby Indian food restaurant, Tandoori Mahal. Everything was delicious. We were all nearly exhausted but also hungry. It was great hanging out with those guys and getting to know them a little better.

Friday – I think the whole company slept in. Eventually, Alex, Geert, Nathalie and I took the shuttle over to the office. This was the day I actually got some work done. First, Tim walked Alex and me through the TC system test architecture, which consists of test classes built on JUnit which can run L1s and L2s in-process. Using this newfound knowledge, I wrote a test which proved that String interning was sometimes broken when the String was compressed – the fix was simple and the failing test then passed. Ahhhh.

Also Friday morning, my hacking String blog entry reached #1 on dzone, and my teammates gave me a sticker as a prize! I had my choice of stickers, so I selected the one which read “My Team Sucks!”

Friday night we hit that same Thai food place one more time: Alex, Jason, Hung, Juris and I. We reminisced about the good ol’ days of the first season of South Park. And Saturday Alex and I flew home.

And now I’m back in my basement office/hole, trying to get back in the flow of actual work. Sounds like we may do another trip in May during JavaOne, since half of my team will already be there anyway (Alex and Geert are both presenters).

Hacking on java.lang.String

This week at Terracotta we accomplished something that two weeks ago I thought was both impossible and dangerous – we instrumented java.lang.String to be compressible.

What, you ask, the heck is going on? String doesn’t implement any interface called JavaLangString. It doesn’t have a __tc_decompress() method. String is final and immutable! It has to be for thread-safety! Are you mad?

I offered these same objections to my boss two weeks ago, but the fact is that something like the above code will be in an upcoming Terracotta release. What makes this all possible is that we instrument Java bytecode on the fly.

Terracotta already does some large String compression when clustering Strings across cluster nodes, but we wanted to improve on that. What if, when we decoded compressed String data from across the cluster and constructed a String instance on a remote node, we actually kept the String instance compressed until it absolutely needed to decompress to be read?

Through the voodoo black magic of bytecode instrumentation, we have accomplished this and made it completely transparent to the application’s use of String. Here’s a basic outline of what happens:

  • data about clustered compressed String is sent over the wire (we call it “hydration”) and decoded at one of the cluster nodes. The following data is encoded:
    • actual compressed String data, in byte[] array form
    • uncompressed String length (int)
    • String hashcode (int)
  • when decoded, the compressed byte[] array is encoded into a char[] array – basically two bytes can fit into a single char.
  • a String instance is constructed with that char[] array, and also the uncompressed length and original hashcode
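The "two bytes per char" step in the outline above can be sketched like this. BytePacking is a hypothetical class, and carrying the byte count separately to handle an odd number of bytes is my assumption; the post does not say how the real code deals with that case.

```java
class BytePacking {
    // Packs two bytes into each char: even-indexed byte high, odd-indexed byte low.
    static char[] toChars(byte[] bytes) {
        char[] chars = new char[(bytes.length + 1) / 2];
        for (int i = 0; i < bytes.length; i++) {
            int unsigned = bytes[i] & 0xFF;
            if ((i & 1) == 0) {
                chars[i / 2] = (char) (unsigned << 8);
            } else {
                chars[i / 2] |= (char) unsigned;
            }
        }
        return chars;
    }

    // Reverses toChars; byteLength says how many bytes were packed originally.
    static byte[] toBytes(char[] chars, int byteLength) {
        byte[] bytes = new byte[byteLength];
        for (int i = 0; i < byteLength; i++) {
            char c = chars[i / 2];
            bytes[i] = (byte) ((i & 1) == 0 ? c >>> 8 : c);
        }
        return bytes;
    }
}
```

Because arbitrary byte pairs rarely form meaningful characters, the resulting char[] is exactly the kind of gibberish a String must never expose to application code.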

So, if the resulting String were actually read, it would be gibberish (if it were displayable at all). (By “read” I mean, if its internal character content needed to be accessed.) That’s why we need to instrument the String class – if and when the String needs to be read, we have instrumented it so that it knows how to decompress itself. We have basically added a new private boolean field (indicating if compressed or not), a new constructor and some additional methods.

There’s a lot going on here so I’ll point a few things out. First, we intercept any field-level access of the private internal char[] value and we route it through this __tc_getvalue() getter method. This is how we transparently decompress the contents no matter how the String instance is accessed.

Secondly, there are important concurrency issues in play here. String is thread-safe because it is immutable, or was until we got our grubby mitts on it. We need String to remain both thread-safe and highly concurrent, so we wanted to avoid synchronized blocks. Our solution?

We have instrumented value to be volatile, as well as our new $__tc_compressed field. Now if you once more examine the __tc_getvalue() method above, you’ll see that there is a benign race condition. It’s possible two threads could both decompress and set the value field. That’s fine, since the decompression is deterministic and outputs the same result each time. Because the fields are volatile, once they are set those changes should be visible to all other threads. What should never happen is a thread attempting to decompress the characters once they are already decompressed: the call to StringCompressionUtil inspects the data to see if it has already been decompressed, and if it has, it returns null.
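The same idea can be reproduced outside of String with two volatile fields and no locks. Treat this as a sketch under heavy assumptions: LazyText and ToyCodec are hypothetical, and the "compression" here is just a marker char plus reversal so that already-decoded data can be recognised. Only the null-if-already-decompressed contract is taken from the description above.

```java
// Hypothetical toy codec: a marker char followed by the reversed content.
class ToyCodec {
    static final char MARKER = '\uFFFF';

    static char[] encode(char[] plain) {
        char[] encoded = new char[plain.length + 1];
        encoded[0] = MARKER;
        for (int i = 0; i < plain.length; i++) {
            encoded[i + 1] = plain[plain.length - 1 - i];
        }
        return encoded;
    }

    // Returns null when the data is already in decoded form.
    static char[] decode(char[] data) {
        if (data.length == 0 || data[0] != MARKER) {
            return null;
        }
        char[] plain = new char[data.length - 1];
        for (int i = 0; i < plain.length; i++) {
            plain[i] = data[data.length - 1 - i];
        }
        return plain;
    }
}

class LazyText {
    private volatile char[] value;
    private volatile boolean compressed;
    private final int length;   // uncompressed length, preset at construction
    private final int hash;     // hash code of the uncompressed content, also preset

    LazyText(char[] encoded, int length, int hash) {
        this.value = encoded;
        this.compressed = true;
        this.length = length;
        this.hash = hash;
    }

    int length() { return length; }                   // never decompresses
    @Override public int hashCode() { return hash; }  // never decompresses
    boolean isCompressed() { return compressed; }

    // Benign race: two threads may both decode, but they store identical content.
    char[] getValue() {
        if (compressed) {
            char[] decoded = ToyCodec.decode(value);
            if (decoded != null) {
                value = decoded;
            }
            compressed = false;
        }
        return value;
    }
}
```

Note the write order in getValue(): value is replaced before compressed is cleared, so a thread that sees compressed == false is guaranteed to see the decoded array as well.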

The above methods don’t exist anywhere in source code per se. Rather, we use ASM to add the bytecode for these new methods dynamically to the String class as it is loaded at runtime. So what we actually have somewhere is code that looks like this:

This creates the bytecode to implement the __tc_decompress() method.

When does an instrumented String instance need to decompress itself? Basically, when its internal char[] array needs to be accessed. It does not need to decompress when the length() method is called, because we have set the uncompressed length when we constructed the String. It does not need to decompress when hashCode() is called, since we have preset the hash code, so that means a compressed String can sit around in a Map all day long without needing to decompress. We even have plans to muck with the equals() method – if two Strings are unequal due to hash code we can exit early and avoid decompressing them to compare character content.

Let’s look once more at the code way at the top:

Because we have instrumented String, at runtime instances of String do implement the interface JavaLangString. So, within Terracotta we can cast an instance of String to JavaLangString, which is an interface containing the __tc_decompress() method. In this way we avoid using reflection. One interesting thing to note is that we first have to upcast to Object to trick the compiler – at compile time the compiler knows (or thinks it knows) that String does not implement this interface, so it errors.
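The compiler trick is easy to try on a stock JVM. In this sketch the interface name and __tc_decompress() come from the description above, but the method's void, no-argument signature and the CastSketch helper are my assumptions. On an uninstrumented JVM the instanceof check is simply false.

```java
// Assumed shape of the interface; the real one may declare more methods.
interface JavaLangString {
    void __tc_decompress();
}

class CastSketch {
    static boolean decompressIfInstrumented(String s) {
        // JavaLangString direct = (JavaLangString) s;  // rejected: String is final and
        //                                              // does not declare this interface
        Object o = s;  // upcast first, so the compiler can no longer rule the cast out
        if (o instanceof JavaLangString) {
            ((JavaLangString) o).__tc_decompress();
            return true;
        }
        return false;
    }
}
```

The instanceof guard is only there so the sketch runs safely without instrumentation; code that knows String has been instrumented can cast unconditionally.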

In conclusion, a bunch of us were talking after Alex’s Terracotta talk last night, and someone made the point that, with any good technology, your users will inevitably take advantage of it in a way that you never dreamed of. And that, I think, is what Terracotta is doing.

First Month as a Terracotta Developer

Somehow my first month at Terracotta has already flown by. Here’s what I’ve been up to.

Mostly I’ve been testing. We have a very cool distributed testing framework called “Droid”, and using it we’ve written a number of test scripts (in Groovy) to simulate some real-world use cases and see where we can make improvements. Droid, in a nutshell, allows us to easily and automatically run tests on multiple nodes, and provides means of synchronizing those nodes. In fact, Droid achieves this clustered synchronization using Terracotta, which just goes to show you that we’re not afraid to eat our own dog food.

In the process of working on this I’ve gotten to dive fairly deep into both Groovy and the java.util.concurrent classes, both of which I’ve enjoyed immensely. My humble contribution to Droid was to add some Groovy utilities enabling us to set up multiple test “phases”, which are simply points at which the nodes all synchronize by waiting on a CyclicBarrier. Our test scripts can now register closures to run during a phase, which cleans up the scripts themselves somewhat. Droid already had the ability to “kill” a test node, which was implemented rather ingeniously by using a cluster-wide (via Terracotta) Object to wait on and notify all – basically a daemon thread in each node would wait on this object, and when notified would check to see if it had been killed and should do a System.exit(-1). Building upon this, in my Groovy utilities I added methods to easily set up a kill phase, in which all testing would be suspended while a node was killed off – this leverages the Runnable CyclicBarrier action, which is run before any threads are released from the barrier.
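The barrier-action behavior that the kill phase relies on can be seen in plain Java (our utilities are in Groovy, and Droid's own API is not shown here). PhaseSketch is a hypothetical class; the log entries stand in for real test work and for the "kill a node" step.

```java
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;

class PhaseSketch {
    // Runs two phases on the given number of "nodes" and returns the event log.
    static List<String> runTwoPhases(int nodes) {
        List<String> log = new CopyOnWriteArrayList<>();
        // The barrier action runs once, after every node has arrived
        // and before any node is released.
        CyclicBarrier barrier = new CyclicBarrier(nodes, () -> log.add("barrier action"));

        Thread[] workers = new Thread[nodes];
        for (int i = 0; i < nodes; i++) {
            workers[i] = new Thread(() -> {
                try {
                    log.add("phase 1");
                    barrier.await();
                    log.add("phase 2");
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return log;
    }
}
```

With three nodes the log always contains three "phase 1" entries, then the single "barrier action", then three "phase 2" entries, because nothing in phase 2 can start until the action has finished.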

Also as part of all of this, I’ve become reacquainted with Unix. Terracotta has an impressive array of “perf” machines dedicated entirely to continuous performance testing. Some are tuned to act as client L1 nodes; other, more powerful multicore machines are tuned to act as the L2 servers. These machines can be reserved, scheduled and used all remotely. So my third week was largely spent actually running some of these distributed tests on multiple perf machines, which involved ssh and sudo.

This week, based in part on some findings from running these tests, I have finally begun (along with Alex) diving into actual production code to see what improvements can be made. In particular, we are looking into the case where cache values consist of very large strings, for example large XML documents, and we are investigating how we can improve our string compression code, or whether we can avoid unnecessarily faulting String values into an L1 node.

I love working at Terracotta. I love it! I could go on and on about other cool stuff that’s happening, but I won’t. Suffice it to say, the technologies we use and the problems we are tackling are fascinating. All of my colleagues are both very sharp and very down-to-earth – no power trips going on that I can tell. Working at home is wonderful – it’s so nice to just take a 15 minute break and walk my son to preschool, or have lunch with my family. Or, in theory, work in my boxers, not that I’ve done that, yet. And working with Alex again rocks.

Essential Mac Software for the Terracotta Developer

Nearing the end of my second week on the job at Terracotta. I’m going to try to document here the software that I use, much of which I had to install. Hopefully this will streamline things for the next newbie. If you’re like me and are making the transition from a PC, you may find my handy-dandy Mac transition guide useful.

Terracotta

General

  • Firefox (download)
  • Eclipse Java IDE
  • Adium for IM (supports Yahoo, GTalk, etc)
  • Skype for video chat
  • Eyebeam soft phone (this is the non-free version of the free X-Lite soft phone; apparently with Leopard we must use this. You’ll need to get a download URL and license from the Terracotta help desk.)
  • TextWrangler text editor
  • TextMate, a great text editor (not free)
  • NetNewsWire feed reader
  • Growl (download) system alerts, integrates with Adium, NetNewsWire, Elluminate and other Mac software
  • NeoOffice (download) free OpenOffice for Mac (was preinstalled on my MacBook)
  • iWork office suite, not free
  • iStat menu bar meters for cpu, etc.
  • Terminal (preinstalled) which as of Leopard is now tabbed
  • Quicksilver
  • OmniGraffle – graphics editor du jour
  • Cisco AnyConnect VPN Client (preinstalled)

Java

Eclipse Plugins (all of these were installed via Eclipse’s Update Manager)

  • Terracotta Plugin (URL for update manager) – eases integrating Terracotta into a project
  • ASM (URL) – you’ll want to install the ASM Framework and Bytecode Outline
  • Groovy (URL) – syntax highlighting and whatnot
  • Subclipse (URL) – Subversion support
  • Maven (URL) – needed to resolve Eclipse inter-project dependencies using Maven pom.xml’s rather than Eclipse .classpath or .plugin files. (Not to be confused with Maven Eclipse plugin, a Maven plugin for generating Eclipse project files.)
  • FindBugs (URL)
  • QuickREx (URL) – regex
  • Implementors (URL) – quickly navigate between interfaces and implementations, or super and sub classes

Firefox Plugins

Making the Transition from a PC to a Mac

One of the many things I am learning as a Terracotta developer is how to do my work on a Mac. Prior to this job, I had developed software professionally only on PCs. (I did do all of my Java development in grad school using Solaris on Sparc Stations, but that feels like a long time ago now.)

Below is my attempt to document (i.e. sear into my brain) all of the various shortcuts I’m accustomed to using with Windows, and their equivalents in Mac. There’s already some good documentation on this subject out there, but this is tailored to my experience coming from a PC. This page will likely improve as I learn more.

So far I love my MacBook! I’ve been joking for weeks that I can’t wait to join the cult, and I’ve not been disappointed.

Item or Task         | Windows                       | Mac
system settings      | Control Panel                 | System Preferences (under Apple icon menu)
file system          | Windows Explorer              | The Finder
manage open apps     | Taskbar                       | The Dock
switch between apps  | alt-tab                       | ⌘-tab
change password      | ctrl-alt-del, change password | System Prefs -> Account
context menus        | right mouse click             | Ctrl-click
scrolling (laptop)   | use edge of mousepad          | two fingers on mousepad
cut                  | ctrl-x                        | ⌘-x
copy                 | ctrl-c                        | ⌘-c
paste                | ctrl-v                        | ⌘-v
select all           | ctrl-a                        | ⌘-a
quit                 | File->Quit                    | ⌘-q
hide current window  | minimize window               | ⌘-h
show desktop         | minimize all                  | F11
save                 | File->save                    | ⌘-s
create shortcut      | various                       | drag icon onto Dock

Eclipse
To pass args to Eclipse (ex: -vmargs -Xms128M -Xmx512M -XX:PermSize=64M -XX:MaxPermSize=128M), find the Eclipse icon in the Finder, hold the control key down and click on the icon, select “Show Package Contents”, find the Eclipse.ini file in the new window, and open with a text editor. (This is contained within the Eclipse application bundle.)

Misc
Taking Screenshots in Mac