ORM can lead to inflexibility; Terracotta can help
Okay, granted, I’m biased: I work for Terracotta. Be that as it may, I’d like to share some experiences my teammates and I have had recently using Hibernate while developing a web app.
First, some brief background. We are developing a “reference” web app at Terracotta to use to promote and explore the Sessions Clustering Use Case which we are working to nail. The app is an online exam-taking application, with the goal of supporting 40,000 concurrent users. I’ve blogged about this before, and you can read about the technology stack we settled on. Development has been done primarily by myself and my teammates Geert Bevin, Abhishek Sanoujam, and our supervisor Alex Miller.
Hibernate is wonderful, and it is an integral part of our web app. It feels to me like we got moving pretty quickly using Hibernate for persistence of our domain objects. For ORM, it’s unbeatable.
But the thing I noticed is, there’s just no avoiding the fact that whatever your domain POJOs are that need to be persisted, chances are good that the use of ORM will impose some constraints on how you must write those POJOs. I have two examples of this to share.
Example One – Generics
First, we have an exam Section class which, conceptually, is a container for either multiple sub sections, or Questions, but not both. The ideal solution would be to define Section as this (JPA annotations omitted):
where TestContent is an interface implemented by both Section and Question. Thus, an instance of Section could be declared as having type of either Section<Section> or Section<Question>.
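As a rough illustration of the generic design described above, here is a minimal plain-Java sketch. Only the names Section, Question, and TestContent come from the post; the fields, constructors, and methods are assumptions made for the example, and JPA annotations are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Common type for anything a Section may contain (name taken from the post).
interface TestContent {}

// Hypothetical minimal Question; the real class surely carries more state.
class Question implements TestContent {
    private final String text;

    Question(String text) {
        this.text = text;
    }

    String getText() {
        return text;
    }
}

// The type parameter fixes, at compile time, which kind of child a Section holds.
class Section<T extends TestContent> implements TestContent {
    private final List<T> children = new ArrayList<>();

    void add(T child) {
        children.add(child);
    }

    List<T> getChildren() {
        return Collections.unmodifiableList(children);
    }
}
```

With a definition along these lines, a `Section<Question>` only accepts questions and a section of sections only accepts sub-sections, so the compiler rejects any attempt to mix the two.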
However, at runtime (when starting Tomcat), Hibernate (the JPA provider) threw an exception pointing out that Section had an unbounded type (or something like that). After a little digging around on the internet, I found a forum where someone explained that an Entity cannot have a generic type, because it’s not known until instantiation time what the linked Entity will be.
Therefore I had to compromise. I modified Section, removing its generic type and adding two explicit collections, one for Questions, one for Sections.
This is less than ideal because the Section API itself doesn’t naturally prevent a single Section instance from having both sub-sections and questions, even though we don’t want to allow this.
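A minimal sketch of what the two-collection compromise might look like, again in plain Java without JPA annotations. The field and method names here are assumptions for illustration, not the project's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal Question.
class Question {
    private final String text;

    Question(String text) {
        this.text = text;
    }

    String getText() {
        return text;
    }
}

// No type parameter any more: one explicit collection per kind of child.
class Section {
    private final List<Section> subSections = new ArrayList<>();
    private final List<Question> questions = new ArrayList<>();

    void addSubSection(Section section) {
        subSections.add(section);
    }

    void addQuestion(Question question) {
        questions.add(question);
    }

    List<Section> getSubSections() {
        return subSections;
    }

    List<Question> getQuestions() {
        return questions;
    }
}
```

Nothing in this API stops a caller from invoking both `addSubSection` and `addQuestion` on the same instance, which is exactly the weakness described above.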
Example Two – complex object tree
Similarly, for my other example, one of the constraints is that a Question must have exactly one correct choice (from among its two or more choices). So our first inclination was to structure the Question class thusly:
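The structure in question can be sketched roughly as follows. This is plain Java under stated assumptions: the constructor checks and accessor names are invented for the example, and a real JPA entity would also need annotations and a no-argument constructor.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical minimal Choice.
class Choice {
    private final String text;

    Choice(String text) {
        this.text = text;
    }

    String getText() {
        return text;
    }
}

// The single correctChoice reference makes "exactly one correct choice" structural.
class Question {
    private final List<Choice> choices;
    private final Choice correctChoice;

    Question(List<Choice> choices, Choice correctChoice) {
        if (choices.size() < 2) {
            throw new IllegalArgumentException("a question needs at least two choices");
        }
        if (!choices.contains(correctChoice)) {
            throw new IllegalArgumentException("the correct choice must be one of the choices");
        }
        this.choices = new ArrayList<>(choices);
        this.correctChoice = correctChoice;
    }

    List<Choice> getChoices() {
        return Collections.unmodifiableList(choices);
    }

    Choice getCorrectChoice() {
        return correctChoice;
    }
}
```

Because the question holds one reference rather than a flag per choice, there is no way to represent a question with two correct answers.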
But this caused problems when saving an edited Exam which had had a Question added to it. I no longer have the stack trace handy, but the gist of the Hibernate exception was that a transient (unsaved) object was detected in the object graph being merged (updated).
Alex and I dug in and finally examined the generated database schema. What we saw was that the QUESTION table had a CORRECT_CHOICE column which was a foreign key into another table, QUESTION_CHOICE I think it was. Alex and I theorized that there was a possible ordering problem in updating an Exam with a new Question and Choices – what if Hibernate attempted to set the CORRECT_CHOICE foreign key before inserting the new choices for the question?
I’m not 100% positive that’s the correct explanation, but in any case Alex made the executive decision to simplify our domain model and not spend any more time debugging. We added a boolean “isCorrect” property to Choice, and removed the “correctChoice” reference from Question:
Problem solved – we no longer got the Hibernate exception. But, as Abhishek pointed out, our domain objects no longer enforced the constraint that a question could have only one correct choice. With the updated classes, nothing would prevent instantiating a question with multiple choices marked as correct. This put the burden on additional validation code to enforce this constraint, and overall is just less than ideal.
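To make the trade-off concrete, here is a minimal sketch of the simplified classes together with the kind of extra validation they now require. The `QuestionValidator` helper and all method names are hypothetical; only the boolean “isCorrect” property on Choice comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Choice now carries its own correctness flag.
class Choice {
    private final String text;
    private final boolean correct;

    Choice(String text, boolean correct) {
        this.text = text;
        this.correct = correct;
    }

    String getText() {
        return text;
    }

    boolean isCorrect() {
        return correct;
    }
}

// Question no longer references a single correct choice.
class Question {
    private final List<Choice> choices = new ArrayList<>();

    void addChoice(Choice choice) {
        choices.add(choice);
    }

    List<Choice> getChoices() {
        return choices;
    }
}

// Hypothetical helper: the constraint has to be checked explicitly now.
final class QuestionValidator {
    private QuestionValidator() {}

    static boolean hasExactlyOneCorrectChoice(Question question) {
        long correctCount = question.getChoices().stream()
                .filter(Choice::isCorrect)
                .count();
        return question.getChoices().size() >= 2 && correctCount == 1;
    }
}
```

A check like this has to be called wherever questions are created or edited, which is the additional burden the simplified model introduces.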
How Terracotta Can Help
The point I am agonizingly slowly building to is this: I think it’s acceptable to have these constraints on our persistent domain objects, but only on the ones that should be persisted. An anti-pattern that we at Terracotta have seen again and again is the misuse of the database and ORM to persist state that really does not belong in the System of Record (SOR), but rather is transient state that is persisted only so that applications can be kept stateless in order to scale. One of the Terracotta co-founders, Orion, coined the term “State Monster” to describe this abuse of the db, and recently Wille Faler wrote a very good blog post describing this.
Terracotta can help by providing an alternative to making apps stateless for scalability purposes. With Terracotta, go ahead and write your application in the most natural way, including shared state that is only transient. Consider this helpful graph about data lifetimes when deciding what state belongs in the SOR and what state is merely transient or pending. Then, use Terracotta to both cluster and persist the POJOs that don’t belong in the SOR. The advantage is that Terracotta does not impose any constraints on the API of the sorts I have written about here – generics are fine, arbitrarily complex object graphs are no problem. Terracotta clustered objects don’t even have to implement Serializable.
Brady Hegberg:
Out of curiosity – is there a reason you wouldn’t use something like this Hibernate composite pattern for #1?
http://www.theresearchkitchen.com/blog/archives/57
I’m intrigued by your description of Terracotta though and will check it out.
8 October 2008, 3:26 pm
Scott:
@Brady – thanks for the comment. I don’t think that example you linked to is quite what I would need. The author didn’t write the non-leaf subclass (Division) so it’s hard to tell, and also the base class (OrganizationUnit) doesn’t have a parameterized type. I may be missing something – his formatting is hard on the eyes! In general, the Composite Pattern doesn’t restrict an instance of the non-leaf (compositing) entity to containing only a certain type of leaf entity, which is what I require with my Section above: ideally I want to enforce that an instance of Section can contain EITHER child sections OR questions, but not both.
His example has gotten me thinking that there may be a way to achieve this with a modified version of what he wrote, but really the bigger point I was trying to make is that many POJOs shouldn’t be persisted to the db, but only represent some intermediate state and only require some intermediate storage, and Terracotta is a good fit there because Terracotta does not impose any constraints on the API of those POJOs. Just look at how cluttered the code is in the example you linked to, with all those annotations. That may be acceptable if those classes represent entities that *belong* in the SOR, which they probably do. But it may be worthwhile to free other code from ORM library constraints, if you can just transparently plug Terracotta in at the JVM level and let it handle intermediate persistence for you (not to mention transparent clustering as well).
8 October 2008, 3:58 pm