Weekly Summary

I struggled last week. It was just one of those weeks where nothing seemed to be easy. The week started off nicely with me adding some more functionality to our distributed cache testing framework. That’s a peripheral and relatively new code base that I’m pretty familiar and comfortable with by now. But after that things started queuing up and I couldn’t close anything out.

First, I was asked to take a look at a customer issue involving Serialization of Terracotta-instrumented ConcurrentHashMap instances. The instrumenting we do on that class breaks Serialization of CHM instances between non-Terracotta processes and Terracotta processes. I’d like to blog about this separately, and actually I think someone smarter than me could do a PhD dissertation on how Terracotta instruments CHM (there’s a lot of it) and why it’s necessary. So in two days I was really only able to evaluate the bug, get it reproduced in a test case, and decide on an approach, but was not able to actually make a fix. That was disappointing.
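For reference, here is a minimal sketch of the baseline that a reproducing test case would start from: a plain ConcurrentHashMap round-tripped through standard Java Serialization inside one ordinary JVM. It does not reproduce the bug, which needs one Terracotta-instrumented process and one non-Terracotta process; the class and method names are hypothetical and not taken from the Terracotta code base.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical harness; none of these names come from the Terracotta code base.
class ChmSerializationSketch {

    // Serialize any object to bytes with standard Java Serialization.
    static byte[] toBytes(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read an object back from bytes produced by toBytes.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Write the map out and read it back, as if it had crossed a process boundary.
    @SuppressWarnings("unchecked")
    static ConcurrentHashMap<String, Integer> roundTrip(ConcurrentHashMap<String, Integer> map) {
        return (ConcurrentHashMap<String, Integer>) fromBytes(toBytes(map));
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> original = new ConcurrentHashMap<>();
        original.put("a", 1);
        original.put("b", 2);
        ConcurrentHashMap<String, Integer> copy = roundTrip(original);
        System.out.println(original.equals(copy));
    }
}
```

In a real reproduction the bytes would be written by one kind of process and read by the other, and the test would assert that the copy still equals the original.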

Next, I was asked to see about removing the JSR 107 dependency from our ehcache TIM. By this time it was later in the week and I was packing for our trip to Florida, and I only finally found what I think is the problem while en route to Florida without an internet connection.

Thirdly, a teammate in India and I have been trying to do some followup testing on the initial cache testing I last blogged about, but it has been going slower than we’d like due to various roadblocks too boring to mention. Just a little while ago I committed some more functionality to our distributed testing framework which will allow us to vary the number of Segments used in CHM instances when we test, and that should’ve been done last week.

This week I’m going to make a concerted effort to eliminate potential distractions, and to better manage all the various feeds that bombard me. IM has to be on pretty much all the time, so my coworkers can reach me, but otherwise this week I’m going to turn off Twitter, RSS feeds and all e-mail for long stretches at a time and get…things…done! I’m also going to go to bed early (soon) and try to get up very early, take a walk to wake myself up, and get to work early. I’d like to do that all week and see if it helps.

My family and I are in Destin, FL for two weeks. I’m going to work this week and take next week as vacation. I’m excited to be down here and have a change of scenery, but at the same time I’m surrounded by many more potential distractions.

Weekly Summary – Clustered Performance Testing

I haven’t been writing here as much as I’d like to, and truthfully there is a ton of stuff I want to write about! But it’s hard to make the time. Especially when, given extra time, I’d rather just keep working on what I’m doing :)

Over the last couple weeks I’ve been running a set of clustered performance tests using our homemade distributed testing framework, nicknamed “Droid”. I went into some detail about this in my last post. The testing I did was to measure the cluster-wide throughput (transactions per second) given one, two, four and eight nodes in a cluster (not counting the Terracotta server as a node). We repeated these tests using both ConcurrentHashMap and Ehcache as our distributed cache, and we repeated all of the tests with a new (2.6.2) and older (2.5.4) version of Terracotta.
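To make the measurement concrete, here is a rough single-JVM sketch of the idea: run a fixed amount of work against a shared ConcurrentHashMap with one, two, four and eight workers, and divide operations by elapsed seconds. Threads stand in for cluster nodes here, which is a big simplification. This is not Droid and involves no Terracotta; the class and method names are made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; threads stand in for L1 nodes, and nothing here is Droid code.
class ThroughputSketch {

    // Each worker performs opsPerWorker puts on the shared cache; returns total completed ops.
    static long runWorkers(int workers, int opsPerWorker, ConcurrentHashMap<Integer, Integer> cache) {
        AtomicLong completed = new AtomicLong();
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int id = w;
            threads[w] = new Thread(() -> {
                for (int i = 0; i < opsPerWorker; i++) {
                    cache.put(id * opsPerWorker + i, i); // one "transaction"
                    completed.incrementAndGet();
                }
            });
        }
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return completed.get();
    }

    // Transactions per second from an operation count and an elapsed time in nanoseconds.
    static double tps(long ops, long elapsedNanos) {
        return ops / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        for (int workers : new int[] {1, 2, 4, 8}) {
            ConcurrentHashMap<Integer, Integer> cache = new ConcurrentHashMap<>();
            long start = System.nanoTime();
            long ops = runWorkers(workers, 100_000, cache);
            long elapsed = System.nanoTime() - start;
            System.out.printf("%d workers: %.0f TPS%n", workers, tps(ops, elapsed));
        }
    }
}
```

A single unwarmed run like this says little on its own; it only shows the shape of the calculation behind a TPS number.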

I had a one-on-one with my boss Steve, also one of the co-founders and originally an engineer himself (now head of engineering). We had an interesting discussion about the testing results, and concurrent testing in general. He reminded me to always be aware of unexpected bottlenecks when testing, and always make sure you’re measuring what you think you’re measuring. For example, we designed the test so that none of the machines would be memory or CPU bound – but did I verify that that was in fact the case? Not really. I just set the jvm memory high enough and hoped for the best. We were really trying to get a feel for how Terracotta distributed lock contention would bog down linear scalability as more L1 nodes were added, so Steve’s point was that we don’t want other unexpected resource constraints to mar the measurements.

Late last week, continuing into this week, I started taking a first swipe at collecting cluster-wide statistics in Droid. We already have single-node statistics, but it would save us (primarily Alex) some time if the framework did the number crunching for us. Of course this means we have to use Terracotta and create another distributed object for doing such collecting and processing.

I also spent some time with a new engineer in India, Himadri, trying to get him started on Droid. His development machine runs Windows and I have a MBP, so there’s been some pain there. In particular, we ran into what turned out to be a known (but not by me) issue in our build process that occurs only in Windows.

In other news, so far I absolutely love working out of my home, but on Thursday two weeks ago I experienced the downside. My internet connection went out. Grrr. So I packed up and drove to my parents’ house, but it was out there, too. Stupid of me – my parents and I both have Charter cable, and it turned out that Charter had an area-wide outage that day. At the time I was very irritated – it’s so easy to hate Charter. I finally decided to go to McAlister’s Deli, which has free wifi. I started my working day at 11 o’clock that day. But it ended up being a great experience: the wifi worked fine, and McAlister’s has sweet tea, which to me is like crack cocaine. My boss Alex and I even ended up meeting there last week for a working lunch. Incidentally, Alex has DSL at his house, so chances are good that we won’t both have an outage at the same time.

This Friday I’m leaving with my family for Florida for two weeks. I’m going to work down there the first week.

Weekly Summary

It was a good week. I finally got all of the automated TC Spring tests to pass for Spring 2.5.4, so I was able to mark that issue done. Terracotta now clusters Spring 2.0.x through 2.5.x. That code base is due for a refactoring, though. Our code for clustering Spring uses AspectWerkz to define join points all over the Spring source code, not just the public API. What this means, as I’ve ranted about before, is that even minor changes to Spring’s source code (as occur even between minor releases such as 2.0.5 and 2.0.8) have broken our clustering code. What I’d like to do, when time permits, is see if we can rewrite our aspects to only use methods of the public Spring API as join points. That should give us a whole lot more stability.

My boss Alex is prepping me to help him do some more performance testing. He recently wrote some great blog entries about that here and here. We met with the product management team this week to brainstorm what sort of testing we want to do, what sort of data they might want to have from a marketing/sales perspective, etc. As Alex pointed out, it’s a tricky thing – this sort of testing always leads to finding bugs, which leads to bug fixes, which invalidates any prior testing and so you have to start over. Luckily, we already have a very capable distributed testing framework, developed in-house by Alex, in which we can pretty easily script tests with Groovy. We can have agents on multiple machines (i.e. L1 nodes, talking to a TC L2 server) and have the agents start workers to run tests. The agents can do things like kill and restart workers, to test having to repartition a distributed cache. Sounds like the first thing we’re going to measure is the load time and then the TPS (transactions per second) for a couple different kinds of distributed caches: ConcurrentHashMap and Ehcache.

We found out this week our next big company-wide gathering in San Francisco will be the week of Oct. 13-18. I’ve already booked my flight and hotel room. I’m excited – these trips have so far been a lot of fun.

I did a phone interview for a candidate to join my team. Probably shouldn’t elaborate on that yet, but I will say that Terracotta is very thorough with candidates. When I interviewed back in January, I did five phone interviews, four of them with other engineers, before being invited to come out in person. When I did fly out, I was interviewed by another five people, including the CEO and CTO! Honestly, although it was exhausting, I had a great time! I loved being challenged by, and having conversations with, some very smart and talented people who have produced some amazing software.

New software this week: OmniGraffle, which I’ve heard from everyone is the only graphics editing software you need on a Mac. I’ve got a copy now which I will hopefully be using in the not-too-distant-future to write some more technical blog entries about Terracotta. Also, Alex encouraged us to try out FindBugs, including its Eclipse plugin here (update site). I’ve added both of these to my list of essential Mac software for the Terracotta developer.

Weekly Summary – Slow, Painful TC Spring Progress

After two weeks of debugging, on Friday I added five magic lines of code to make the last three automated Spring tests pass. With this change we can say we support Spring up to 2.0.8 in our upcoming 2.6.2 release. We still have targeted Spring 2.5.x support for an upcoming summer release.

Two weeks of debugging, five lines of code, one bug fix. This has to be improved on.

Just for my own amusement, I’m going to try to list and describe all of the things that accounted for all the time spent on this.

By midweek last week, I had finally gotten into a groove with the Eclipse debugger, stepping through one of the failing Spring container tests. By “container” test, we mean parts of the test were actually running in a web app container, Tomcat in this case. Even so, this was not a very good, tight feedback loop. The basic pattern was for me to start the test, then attach to each of the two processes running in Tomcat with the debugger. (Luckily, our code was already set up to allow for debugging, although it took me a while to find and enable the magic property.) Then I just had to step through code in the debugger, not really sure what I was looking for. I had to alternate doing this, first with Spring 2.0.1 in the classpath, since our test was passing for that version, then with Spring 2.0.8 in the classpath, and look for what was going wrong. (We have a way of easily varying the version of a library such as Spring. However, I spent basically all of Tuesday helping Hung debug a problem in our build process concerning those variants.) Each time I alternated Spring variants, I also had to tweak the source lookup in my Eclipse remote debug configuration, to pull up source from the right Spring version.

The test uses our DSO functionality; it’s basically a semi-complete running instance of the DSO client. At one point I found I needed to hit breakpoints that were occurring during DSO bootstrapping, which meant I had to set yet another magic property (in ClassProcessorHelper) to enable debugging which is normally not available prior to TC instrumentation. I had to dig through my IM chat transcripts to find the name and location of that property, which my boss had mentioned once weeks ago, and then enable it, and find through painful trial and error that it didn’t work unless I did a full clean recompile (but it worked like a charm after that).

The code I was stepping through was instrumented using AspectWerkz, although for the most part the breakpoints seemed to work fine. But the amount of code was just vast. All I can say is, I spent probably twelve to fourteen hours of straight debugging on Thursday and Friday, just hitting breakpoints, comparing state, following hunches and wild goose chases and red herrings. In the end I found the Spring class whose source had changed between 2.0.1 and 2.0.5, and again in 2.0.8. (It was org.springframework.aop.config.ScopedProxyBeanDefinitionDecorator, deep in the guts of Spring’s aop framework.)

So in the final tally, I spent unexpected amounts of time

  • helping debug a build problem which prevented us from using the “variants” feature, to vary the Spring library
  • setting up Eclipse projects for different versions of Spring, so I could browse 2.0.1 and 2.0.8 source code side by side
  • tweaking the Spring source lookup for the Eclipse remote debugging configuration, to alternate between 2.0.1 and 2.0.8
  • figuring out how to enable debugging of our container tests
  • figuring out how to enable debugging during DSO client startup
  • trying (in vain) to write a more lightweight unit test to tackle the problem with
  • debugging for many, many hours once it was all working

And that’s not even counting actually learning AspectWerkz and writing a new pointcut and advice, which I would have had to do anyway, but which itself involved some painful trial and error.

And at the end of all of this, I sort of feel like I’ve just patched some big honking beast that I don’t fully understand. I feel like our code as it stands now (the AspectWerkz pointcuts and advices for clustering Spring) is still very tightly coupled to Spring source code in a very fragile way, and it’s only a matter of time before a new minor Spring version comes along and breaks it again. All week I was thinking, there has got to be a better way.

Our code for instrumenting and clustering Spring is really cool and mind-blowing, don’t get me wrong. I never would have thought of it in a million years. But it is a classic case of code that is tightly coupled, not modular at all. There is an impressive amount of test coverage through automated integration and system tests, which we need. But the Spring test suite takes about an hour to run, and that’s not even every test. There are a lot of AspectWerkz advices and pointcuts being used to cluster Spring, and none of them can be run and tested in isolation. You just have to start up the whole shebang and debug.

I consider myself a TDD and refactoring proponent. All week long, the pain of trying to debug this was screaming out to me that this part of the code needed refactoring and unit tests. Honestly, I’m not sure I could have done any worthwhile refactoring and still figured out the problem in the same amount of time. That’s the classic TDD/refactoring chicken and egg problem – when you’re in a time crunch, the prospect of trying to clean up some code in order to make things easier at some undetermined future point seems insurmountable, and meanwhile you always tend to mentally downplay the amount of time it will take to just debug and fix the code as is. That’s the fear that comes with TDD.

In an interesting twist, I was talking to my boss Steve, and he asked me why I wasn’t refactoring. My boss wants me to refactor! He didn’t order me to or anything, but he sounds very much in favor of it. He said he almost always regrets not refactoring as he goes.

I think keeping the code clean and testable is so important to the long term health of a large, constantly evolving code base. I mean, we can’t just keep having software developers spend two weeks debugging one bug.

There’s a lot more I want to think about along these lines, but I’ve written enough for now.

Bash and TC Build Hacks I Learned in the Last Two Hours

There’s very good documentation about Terracotta’s in-house TC Build system already. But I’ve been doing some intense debugging with Hung, and have learned some things that I want to write down before I forget.

run without ivy: tcbuild blah blah --no-ivy – I’m assuming this runs faster because it skips using Ivy to check that all dependencies are in place.

run without compiling: tcbuild --no-compile blah... – useful when just shuffling some runtime dependency or something.

put environment stuff in .bashrc

check trunk/buildsystem to find things like jruby

For our automated container tests, individual jar files are placed in one huge WAR file. This is not true for ordinary unit tests.

Doing something like ./tcbuild check_one CustomScopedBeanTest --no-ivy > log.txt 2>&1 puts output in a file, and the last part redirects err stream to output stream.

Important shared stuff at /shares/terra/jdk/ such as Java, ant, etc

Grep trick 1: ps -ef | grep java to see details about Java processes running

Grep trick 2: env | grep JAVA to see environment variables I should have set up to run tcbuild

Grep trick 3: find <path> -name <filenamepattern> | xargs grep <searchstring> – finds all files matching filenamepattern that also have the search string within them

find trick: rm -rf `find . -type d -name .svn` – removes all .svn directories recursively

~/.tc/appserver is where tomcat is stored during automated tests – may want to remove as sanity check sometimes.

~/.ivy* is where ivy stuff is stored – may want to remove prior to doing total clean rebuild.

Weekly Summary – TC Spring again

This weekly summary actually encompasses the last three weeks. Sigh.

Lots of activity throughout dev is centered around the Terracotta 2.6 and 2.6.1 releases, as well as the upcoming 2.6.2 release.

Primarily I’ve been working on updating Terracotta’s Spring support to 2.5.x. Currently we only support up to 2.0.5. I had thought I had gotten it working up through Spring 2.0.8, but late last week we fixed a bug in our build process which then revealed three failing automated TC Spring tests which were previously (incorrectly) passing. So Spring 2.0.8 is not quite there…but close. Meanwhile, my compadre Nitin had made some changes that got TC working with Spring 2.5, but those changes are not backwards compatible to Spring 2.0.x, so I’m investigating whether they can be merged together somehow. Since we are dependent on the Spring source code in order to instrument their code (by using AspectWerkz), we are subject to the whims of whatever source code changes occur between even minor releases (such as differences between Spring 2.0.5 and 2.0.8).

The other thing of note that I got accomplished was to respond to this post on our forums about a deadlock occurring in Terracotta L1. The poster had nicely laid it all out for us, with a stack trace excerpt clearly showing the deadlock. My teammates and I reviewed the pertinent class, and I cleaned up a number of synchronization bugs or missing synchronization. The deadlock itself was cleaned up by moving to a CopyOnWriteArrayList for a collection, which previously was being locked while iterating through it (read-only) and doing expensive stuff. The fix will be in 2.6.2 release.
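The general shape of that kind of fix can be sketched as follows. This is a generic illustration, not the actual Terracotta class (which isn’t shown here); ListenerRegistrySketch and its methods are hypothetical names. The point is that a CopyOnWriteArrayList iterates over a snapshot, so the read-only loop needs no lock and the expensive work no longer runs while holding one.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical registry; it only illustrates the collection swap, not Terracotta's code.
class ListenerRegistrySketch {

    // Iteration works on a snapshot of the list, so readers never block writers.
    private final List<Runnable> listeners = new CopyOnWriteArrayList<>();

    void add(Runnable listener) {
        listeners.add(listener);
    }

    // Runs every listener without holding a lock; returns how many were invoked.
    int notifyAllListeners() {
        int invoked = 0;
        for (Runnable listener : listeners) {
            listener.run(); // may be expensive, may even call add() on this registry
            invoked++;
        }
        return invoked;
    }

    public static void main(String[] args) {
        ListenerRegistrySketch registry = new ListenerRegistrySketch();
        // A listener that registers another listener while the list is being iterated.
        registry.add(() -> registry.add(() -> { }));
        System.out.println(registry.notifyAllListeners());
        System.out.println(registry.notifyAllListeners());
    }
}
```

The trade-off is that every write copies the backing array, so this suits collections that are read far more often than they are modified.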

I was without internet connection at my house a couple weeks ago for a few days. I had to do bloody battle with Charter to get that fixed. Ultimately a technician came and found that the line to my house had been put on a splitter at some undetermined point in the past, and so my signal strength was no longer strong enough. Meanwhile, luckily, I was able to go to my parents’ house and get some work done there. Have I mentioned that I love my MacBook Pro, and wireless internet?

10 Eye-Catching DZone Titles With Words Like ‘Naughty’

Writing tired, formulaic Top-10 lists or posts with provocative titles is a shameless way to attract traffic. And I want in on the action! So without further ado, here is my Top Ten Eye-Catching DZone Titles.

  1. 10 Ways to get an article on dzone – I’m being self-referential. That means I’m clever!
  2. I’m Going To Scale My Foot Up Your Ass – (Not suitable to read at work, I’m afraid.)
  3. Ohh regex you naughty naughty boy you – …so very very naughty…
  4. Dethreading a Chicken – Sounds complicated.
  5. Are You User Experienced? – I voted this one up just for the title.
  6. Boo! I’m an Event Manager
  7. Revenge of Hello Terracotta – Aha! Someone wrote a post which was a sequel to another post, and in naming it cleverly referenced old monster movie sequels like Revenge of the Creature or Revenge of the Son of Blacula’s Return. What a funny guy that author must be!
  8. Hell Hath No Fury Like Polymorphism Scorned
  9. You’re Fat and I Hate You – No idea what the post is about, love the title.
  10. Unit testing is dumb – I am shocked, shocked and appalled, at such an obviously inflammatory title! What troglodyte wrote this crap?

Thanks for reading…suckers! :)

Should Guice be used in unit tests redux

My previous post was about whether Guice should be used in unit testing, and Crazy Bob himself commented:

I prefer the simpler unit test. I think the real moral of this story is that unit tests alone are never enough.
Also, you did get the error right away when you started your app. That’s like an automatic test that you needn’t replicate manually (comparable to a compiler check).

I started to reply and my comment quickly grew long enough that I decided to just write another post. So, should Guice be used in unit tests? I hate to be wishy washy but I think the answer is it depends. As happens so often in software development, there are conflicting forces at work. In this case I think two conflicting forces are ease of unit testing versus fast feedback loop.

One thing I do completely agree with Bob on is that unit tests alone are never enough. Alex Miller wrote a great post about that called Weaving the Test Fabric.

Fast feedback loop

Different parts of the test fabric perform differently and are intended to be run at different frequencies. It’s common to only run all of the automated tests once during a nightly build, especially if any of the system tests are long-running and/or resource-intensive enough. It’s common to have a medium-weight suite of tests which a developer is expected to run once prior to committing the code, but which may still take quite a few minutes to run. And I prefer to have a suite of as many automated tests as possible which run fast. If you have a fast running suite of automated tests, then you have a fast feedback loop. I’m talking under twenty seconds, the sort of suite you can run in between every single change, as you work. Such tests most likely make heavy use of mocks and have no dependencies on external resources like db’s, file systems or network connections. Such tests are most likely unit tests.

With Guice, as with any code, the question is: how long can you stand to wait to find out if you implemented, or broke, something?

I would not be able to stand having to start the whole application up every time I was debugging my Guice usage. Bob Lee reminded me that you can still have automated tests that test your usage of Guice; they could be component or system level automated tests, not necessarily unit tests. They could even be included in the fast running suite.

Ease of unit testing

It is so important to keep unit tests, and unit testing, simple. For the sake of all current and future developers working on the code, the barriers to unit testing need to be as few as possible – basically just know JUnit. It’s hard enough as it is to get people to write unit tests. So, having thought about it, now I’m not so sure it’s worth having Guice in unit tests if it burdens my peers. I know how overwhelmed I feel when I work on a project and find I have to learn additional in-house unit test conventions and special TestCase subclasses that I’m expected to start with. It’s an awfully attractive idea to keep unit testing confined to simple JUnit tests, and leave fancier stuff for more complicated integration-style tests.

This is all predicated on the idea that Guice is not mainstream yet, as JUnit is, and would still be a burden to have to learn just to unit test with. If Guice ever becomes more widely used and understood, then I might revert back to my earlier conclusion. After all, Guice is the new new – why not embrace it in unit test code if it is being embraced in production? I like the API very much and think it could easily become part of the unit testing vernacular.

Conclusion

The two forces I mentioned above do not have to be diametrically opposed. Thinking about it now, I could envision component-level or subsystem-level automated tests that do nothing except test the Guice dependency injection for that component. Such a test could still be fast-running if mocks are used as appropriate – Guice itself is performant enough thanks to its pure Java implementation. And such tests would naturally force the developer to place the proper Guice annotations in the proper place. Unit tests could remain simple JUnit tests.

Weekly Summary

Last week was shortened by jury duty on Monday. Fortunately, I was never selected from the pool, and on Tuesday I was back to work.

There are (still) a number of monkey failures (such as this one) that I need to get working on.

However, I discovered I could procrastinate tackling those by checking the forums. I decided to try to answer this post (which has since been addressed by a couple of my teammates). Almost two days later, I conceded that it’s really really hard to try to cluster the underlying javax.swing.text.AbstractDocument of a JTextField. I still haven’t got it. (See gkeim’s response for a clever workaround.)

I finished up the week by working on droid. One of my peers was having trouble running a test in which he wanted one of the spawned workers to have a different tc-config than all the others. There are some weird subtleties in passing vm arguments through the agent which are intended for the worker, but as it turned out, I believe the functionality is already there and didn’t require any changes on my part, just an explanation of how to do it.

Should Guice be used in unit tests?

At my Guice presentation recently, I used some small code exercises to demonstrate the basics of Guice. Here is my simplest example, involving a Service, a Client needing a Service injected, a test case, and an Application.

First, the Service:

Next, the Client, and its test (JUnit 4):

And finally, an Application that ties it all together using Guice:

During the talk, Jeff Grigg had a question followed by an interesting comment. For his question, he asked me to remove the @Inject attribute from Client and re-run the test. I did, thinking I was about to demonstrate a typical helpful Guice Exception. But the test passed. I was momentarily lost, until I remembered that the unit test did not use Guice at all. I ran the Application just as a sanity check, and it was indeed broken.

Jeff was on it like a hobo on a ham sandwich – I had broken the application but the test still passed. My test did not justify the use of the @Inject attribute.

He was absolutely right. I realize now that I had been following the example from the Guice Developer Day Slides, which had taken a non-Guice example and migrated it to Guice, leaving the JUnit test unmodified so that, like my test above, it continued to manually pass a mock to the client code and did not use Guice. This, I realize now, is fine for a tutorial but it is not truly Test Driven Development, where every line of code has some test to justify its existence.

Naturally, Jeff raised the question – should Guice then be used in unit tests?

I think the answer is yes, why not? After all, Guice suggests it be thought of as the new new, and that using Guice is no more work than plain old factories.

If Guice were used in unit tests, then the tests would naturally require that the code have the necessary @Inject attributes, or whatever other attributes are necessary for Guice. The only difference would be that different Modules would have to be created for the tests, which contained bindings which differed slightly from the production Module(s).

I tried implementing a new client, strictly adhering to TDD and using Guice in my unit tests. Here is what I came up with.

There are two things I notice. One is that my mission is accomplished – my Client’s @Inject attribute is now truly unit tested – the test will fail without it. As a bonus, Client is slightly simplified in that it no longer needs a constructor which accepts a Service. The other thing I notice is that the test is somewhat more complicated, but really it’s only the addition of a Module and Injector.
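The effect is easy to see in a stripped-down, dependency-free sketch. To keep it runnable without Guice on the classpath, it uses a toy reflection-based injector and a made-up @InjectHere annotation as stand-ins for Guice’s Injector, test Module and @Inject; none of the names below are Guice API, and the real thing does much more. A client built through the injector only works if the annotation is present, which is exactly what makes the test justify it.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Guice's @Inject; this is not the real annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface InjectHere { }

interface Service { String go(); }

interface Runner { String run(); }

// Client whose dependency is marked for injection.
class AnnotatedClient implements Runner {
    @InjectHere Service service;
    public String run() { return service.go(); }
}

// The same client with the annotation removed.
class UnannotatedClient implements Runner {
    Service service;
    public String run() { return service.go(); }
}

// Toy field injector standing in for a Guice Injector configured by a test Module.
class ToyInjector {
    private final Map<Class<?>, Object> bindings = new HashMap<>();

    <T> ToyInjector bind(Class<T> type, T instance) {
        bindings.put(type, instance);
        return this;
    }

    <T> T getInstance(Class<T> type) throws ReflectiveOperationException {
        Constructor<T> constructor = type.getDeclaredConstructor();
        constructor.setAccessible(true);
        T instance = constructor.newInstance();
        for (Field field : type.getDeclaredFields()) {
            if (field.isAnnotationPresent(InjectHere.class)) {
                field.setAccessible(true);
                field.set(instance, bindings.get(field.getType()));
            }
        }
        return instance;
    }
}

class InjectionTestSketch {
    // Builds the client through the injector with a mock Service bound, then runs it.
    static String runInjected(Class<? extends Runner> clientType) {
        Service mock = () -> "mock";
        try {
            Runner client = new ToyInjector().bind(Service.class, mock).getInstance(clientType);
            return client.run();
        } catch (NullPointerException e) {
            return "not injected"; // the field was never populated
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runInjected(AnnotatedClient.class));
        System.out.println(runInjected(UnannotatedClient.class));
    }
}
```

A test that constructs the client by hand and passes the mock in directly would pass for both classes, which is the gap Jeff pointed out.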

My feeling is that it’s worth the extra effort to use Guice in unit tests, if you’re going to embrace Guice in your production code anyway.

In a future post, I’d like to investigate whether there is a better way to integrate Guice with JUnit, alleviating the need to write boilerplate unit testing Modules over and over again. Just tonight I noticed both GuiceBerry and AtUnit which potentially address this and look somewhat promising.