Saturday, November 27, 2004

About O/R Mappers

We all like object-oriented programming, right? SQL, well... it's cool to have SQL code generated. And your domain object model, too. Some Not-Invented-Here experts will even propose to build a home-brewn O/R mapper. After all, they can do a better job than those Hibernate guys, now don't they?

I know O/R mappers look tempting, and even more tempting is to build one on your own, at least to some architecture astronauts. Things might look promising unless... people actually start applying it. But at this point in time, the investment has been made. No way back. And more than once, the developers who built the O/R mapper are not the same who actually use it, and the blame game is about to begin.

I have seen applications stall, projects fail, people quit, companies go south thanks to the overly optimistic usage of home-made, unproven O/R mappers. Building an O/R mapper is a long and painful road. Avoid unnecessary database roundtrips, provide the right caching strategies, don't limit the developer ("but we were able to do subselects in SQL"), optimize SQL, support scalability, build a graphical mapping tool and so on. Some people designing SOA / message-oriented systems just DON'T WANT their database model floating around within the whole application, but that's what is most likely going to happen.

Will it support all kind of old legacy systems, including some bizarre / not-normalized database designs? And yes, sometimes your programmers don't know the consequences of a virtual proxy being expanded. Only database profiling will show what goes on behind the hood. "One Join" is certainly the preferred solution in comparison to "N Selects", but this is porbably not going to happen once you traverse over a 1:N relationship. Will your caching algorithm still work in concurrent/distributed scenarios? And what about reporting? Your report engine might require plain old resultsets, no persistence objects.

All those benefits the architects expected - they just don't turn out that way. "Too much magic", as one of our consultants expressed it. There must be a reason why accessing relations databases is done in SQL. It's just coherent. There is no OO equivalent, that fits. There is no silver bullet.

Now, there are scales of grey, just as there are application scenarios, where O/R mappers do make sense. I recommend considering an O/R mapper if

(1) You have full control over the database design (no old legacy database).
(2) Load and concurrency tests prove that the O/R mapper works in a production scenario.
(3) Your customer favors development speed over future adaptability.
(4) The O/R Mapper supports SQL execution (or a similar kind of query language).
(5) The O/R Mapper is a proven product, and not the pet project of an inhouse architect.

It is also important to distinguish standalone O/R mappers from container-controlled persistence (the later were designed for running on a middle tier). J2EE Container-Managed-Persistence Entity Beans do make sense in a couple of scenarios (and then again they don't make sense in a lot of others, and J2EE architects will rarely ever recommend a 100% CMP EJB approach). And of course, Hibernate and others do a pretty good job as well on N-tier systems.
Summing up, I strongly agree with Clemens Vasters in most of the cases. He states:

I claim that the benefits of explicit mapping exceed those of automatic O/R mapping by far. There's more to code in the beginning, that's pretty much all that speaks for O/R. I've wasted 1 1/2 years on an O/R mapping infrastructure that did everything from clever data retrieval to smartt [sic] caching and we always came back to the simple fact that "just code the damn thing" yields far superior, more manageable and maintainable results.