Does object persistence complicate architectures with existing data models?

Brian LeGros | October 7th, 2007 | programming  

Recently at work we’ve been using a lot of ORM solutions on the Java side for our service APIs. Our intent is to shift our interaction with DB resources toward object persistence and away from ad-hoc data access. So technology-wise the transition has involved going from Spring JDBC to things like iBatis and Hibernate.

To take a step back, let me qualify what I consider to be object persistence versus data access. I see data access as the direct use of a data model (e.g. - relational model via an RDBMS, hierarchical model via XML, etc.) by an application. Typically I categorize data access code as code that works directly with the data model’s query language (e.g. - SQL, XQuery/XPath, etc.). Now, it usually benefits me when working in an object-oriented paradigm to find some type of mapping between the data model and the application’s object model, which is where I see the desire to migrate to an object persistence paradigm for data access. In fact, I consider object persistence just a subset and abstraction of the term data access I’ve defined above. For me object persistence implies that data access is distilled into a simple set of conventions and/or configurable options that allow objects from an application’s domain to gain basic CRUD (create, read, update, delete) behaviors. A few popular patterns for object persistence come to mind (ActiveRecord and Data Mapper), but I get the impression that what is perceived as a mature object persistence solution is one which completely abstracts away direct interaction with the data model.
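To make the distinction concrete, here is a minimal Java sketch of the Data Mapper idea: the domain object knows nothing about storage, and a separate mapper gives it CRUD behaviors. Every name here (`Customer`, `CustomerMapper`, `InMemoryCustomerMapper`) is made up for illustration, and the in-memory map is only a stand-in so the example runs without a database; a real mapper would issue SQL where the comments indicate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Demo entry point, declared first so the file can be launched directly.
final class DataMapperSketch {
    public static void main(String[] args) {
        CustomerMapper mapper = new InMemoryCustomerMapper();
        mapper.insert(new Customer(1L, "Ada"));

        Customer found = mapper.findById(1L).orElseThrow(IllegalStateException::new);
        found.rename("Ada Lovelace");
        mapper.update(found);

        System.out.println(mapper.findById(1L).map(Customer::getName).orElse("<missing>"));
        mapper.delete(1L);
        System.out.println(mapper.findById(1L).isPresent());
    }
}

// Hypothetical domain class: no SQL, no knowledge of how it is stored.
final class Customer {
    private final long id;
    private String name;

    Customer(long id, String name) { this.id = id; this.name = name; }
    long getId() { return id; }
    String getName() { return name; }
    void rename(String newName) { this.name = newName; }
}

// Data Mapper: the CRUD behaviors live outside the domain object.
interface CustomerMapper {
    void insert(Customer customer);
    Optional<Customer> findById(long id);
    void update(Customer customer);
    void delete(long id);
}

// In-memory stand-in for a table. A real implementation would run
// statements such as "SELECT id, name FROM customer WHERE id = ?" here.
final class InMemoryCustomerMapper implements CustomerMapper {
    private final Map<Long, String> rows = new HashMap<>();

    public void insert(Customer customer) { rows.put(customer.getId(), customer.getName()); }

    public Optional<Customer> findById(long id) {
        String name = rows.get(id);
        return name == null ? Optional.empty() : Optional.of(new Customer(id, name));
    }

    // Only touches rows that already exist, like an UPDATE would.
    public void update(Customer customer) { rows.replace(customer.getId(), customer.getName()); }

    public void delete(long id) { rows.remove(id); }
}
```

With ActiveRecord, by contrast, methods like `save()` and `delete()` would sit on `Customer` itself, which is the trade-off between the two patterns.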

I don’t necessarily think object persistence is a bad solution; in fact, I can see a lot of productivity coming from having less work to do when building an integration layer into your application. What I wonder about is the following:

  1. When an object persistence solution should be selected for use with an application.
  2. The amount of productivity gained from using an object persistence solution.
  3. The degree to which an object persistence solution should abstract direct data access.

Let’s take the example of an application being developed from scratch without an existing relational model. I think the 3 points from above are easily answered by looking at how popular the Rails framework has become. I think when building an application anew, having an abstraction for the relational model is a great approach. As a developer you can focus solely on your object model which, in my opinion, ideally means more time for modeling a domain and deciding on your application’s architecture rather than on how it’s persisted.

Unfortunately, I think a majority of shops that like to apply the “enterprise” buzzword to their software infrastructure don’t have the luxury of building their data models from scratch. Let’s take the example of building an application from scratch using an existing relational model. For this example I think there are a few divisions I have to make before forming my analysis:

  1. The relational model was constructed focusing on standards (i.e. - SQL DDL) and over time has become fairly normalized.
  2. The relational model was constructed utilizing vendor specific objects and over time has become fairly normalized.
  3. The relational model was constructed utilizing standards and vendor specific objects and over time has become more disheveled than normalized.

For the first division, I think an object persistence solution is extremely useful. In my opinion, using a solution that implements the Data Mapper pattern will result in productivity because the implication is that the data access being performed isn’t that complicated. I also think that abstracting out direct data access may be beneficial so that it motivates the keepers of the relational model to maintain all of the work they’ve put into normalization moving forward.

For the second division, I think the same assumptions apply as for the first division. From what I can tell it seems like most “mature” object persistence solutions provide you a means by which to work with vendor specific objects (e.g. - stored procedures, custom objects, rules, etc.). I think normalization will save the day as long as the vendor objects in the relational model don’t contain business logic that should have found its way into your application but hasn’t.

This point I feel leads us to the third division; your shit is all mixed up from a relational perspective. I think object persistence can still be considered for use with an application of this type, but the complexity of the application should come into play as well when making the decision. I think a great gauge of whether or not to go the object persistence route is the complexity of your domain. If I find that I’ve got 2 domain classes with a simple composition relationship, I don’t think using an object persistence solution is going to make me productive; heck, the domain may even be overkill. As far as direct data access goes, if normalization isn’t there to provide some structure to the relational model, the implication, at least in my company’s case, is that vendor objects were used to supplement the benefits you get from normalization. This means that business logic that possibly should be contained in the application is tucked away in a vendor object; short of rewriting that vendor object, you may need to reuse the functionality it provides. Direct data access at this point is the only option from what I can tell (e.g. - you need to use an inline non-deterministic SQL function as criteria for a join clause in a SELECT statement). From what I’ve seen in iBatis, object persistence with direct data access is definitely a real and usable option.
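As a rough sketch of what “object persistence with direct data access” can look like, the Java below keeps a hand-written SQL statement next to the code that maps its rows onto objects. This is not iBatis’s API. `SqlRunner`, `OrderMapper`, `RegionalOrder`, the table names and the vendor function `fn_active_region()` are all hypothetical, and the runner in `main` is a fake that returns canned rows so the example runs without a database.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Demo entry point, declared first so the file can be launched directly.
final class HandWrittenSqlSketch {
    public static void main(String[] args) {
        // Fake runner standing in for a JDBC call: ignores the SQL, returns one canned row.
        SqlRunner fake = (sql, params) -> {
            Map<String, Object> row = new HashMap<>();
            row.put("order_id", 42L);
            row.put("region", "EMEA");
            return Collections.singletonList(row);
        };
        OrderMapper mapper = new OrderMapper(fake);
        for (RegionalOrder order : mapper.findOrdersInActiveRegion(1001L)) {
            System.out.println(order.getOrderId() + " " + order.getRegion());
        }
    }
}

// Hypothetical seam over the database: SQL text and positional parameters in, rows out.
interface SqlRunner {
    List<Map<String, Object>> query(String sql, Object... params);
}

// Hypothetical domain object built from the query result.
final class RegionalOrder {
    private final long orderId;
    private final String region;

    RegionalOrder(long orderId, String region) { this.orderId = orderId; this.region = region; }
    long getOrderId() { return orderId; }
    String getRegion() { return region; }
}

final class OrderMapper {
    // Hand-written SQL that reuses logic living in the database.
    // fn_active_region() is a made-up vendor function used as join criteria.
    static final String FIND_IN_ACTIVE_REGION =
        "SELECT o.order_id, r.region "
      + "FROM orders o JOIN regions r ON r.region_id = fn_active_region(o.customer_id) "
      + "WHERE o.customer_id = ?";

    private final SqlRunner runner;

    OrderMapper(SqlRunner runner) { this.runner = runner; }

    // Callers get objects back; the SQL stays visible and editable in one place.
    List<RegionalOrder> findOrdersInActiveRegion(long customerId) {
        List<RegionalOrder> result = new ArrayList<>();
        for (Map<String, Object> row : runner.query(FIND_IN_ACTIVE_REGION, customerId)) {
            long orderId = ((Number) row.get("order_id")).longValue();
            result.add(new RegionalOrder(orderId, (String) row.get("region")));
        }
        return result;
    }
}
```

The point of the design is that the application still works with objects, but nothing tries to generate or hide the query that depends on the vendor object.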

As one final point, I think utilizing an object persistence solution, especially in this third division, needs to be approached with a lot of caution when dealing with an existing data model. The forces that drove the creation of the existing data model are not the forces that are driving the creation of the application with which you need to integrate. From what I can tell, in the implementations I’ve seen at work, the object persistence API can definitely start to drive, at a minimum, the way the application domain is being constructed. I’m not just talking about semantics; I’m also talking about data encapsulation and relationships. From what I can tell, depending on your architectural approach (e.g. - fat models), this can result in some pretty strange object behaviors emerging. I don’t have any good examples, so that pretty much just makes this paragraph bunk, so chalk these last few sentences up to opinion. I would encourage readers, however, to do my work for me; show me some examples where you find this to be true.

Overall, I really like the idea of using object persistence in my applications, if I feel the complexity of the application and the data model being used warrant it. I do think sometimes that ORM tools are overused, but as employed developers I still think it comes back to finding the right set of tools for the job and being productive. If object persistence isn’t going to fulfill those requirements, just remember that it’s OK not to use it.

NOTE : I read a really good blog post about ActiveRecord and Rails yesterday that motivated me to finish this post. Even though this post isn’t about patterns in object persistence, I think it’s a great read if you have some time.

cf.objective() 2007 : Advanced Transfer ORM Techniques

admin | May 6th, 2007 | conferences  

This session was really cool, even though I have never used Transfer and didn’t go to the intro course. I’ve got an idea about how ORMs work, so with that in mind, here’s my summary.

Mark Mandel went over the following features in Transfer:

  • Caching - Ability to define the scope in which Transfer will cache, in what quantities, and for how long. Cloning is also available so you don’t have to work with the copy of a domain object in the cache. Once you’re done with the clone, you can force its state onto the cached object that generated it and perform operations. Transfer also provides a means to recycle cached objects rather than delete them for later re-creation.
  • Observable Events - Aspects associated with events in Transfer : beforeCreate, afterCreate, beforeUpdate, afterUpdate, beforeDelete, afterDelete, afterNew (no ColdSpring integration for AOP yet)
  • Decorators - Ability to wrap a domain object retrieved from Transfer with additional functionality (i.e. - make a value object heavier weight)
  • Transfer Query Language (TQL) - Mark pretty much said he was trying to copy HQL from Hibernate and from what I could tell, it looked good
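The clone-then-sync idea from the caching bullet can be sketched in a few lines. This is a Java analogy, not Transfer’s actual (ColdFusion) API: `CloningCache`, `Account`, `checkOut` and `sync` are all hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Demo entry point, declared first so the file can be launched directly.
final class CloneCacheSketch {
    public static void main(String[] args) {
        CloningCache cache = new CloningCache();
        cache.put(new Account(1L, "old@example.com"));

        Account working = cache.checkOut(1L);          // private copy of the cached object
        working.setEmail("new@example.com");
        System.out.println(cache.peek(1L).getEmail()); // cached object is still unchanged here

        cache.sync(working);                           // force the clone's state onto the original
        System.out.println(cache.peek(1L).getEmail());
    }
}

// Hypothetical domain object with an explicit copy method.
final class Account {
    private final long id;
    private String email;

    Account(long id, String email) { this.id = id; this.email = email; }
    long getId() { return id; }
    String getEmail() { return email; }
    void setEmail(String email) { this.email = email; }
    Account copy() { return new Account(id, email); }
}

final class CloningCache {
    private final Map<Long, Account> cached = new HashMap<>();

    void put(Account account) { cached.put(account.getId(), account); }

    // Returns the shared cached instance itself.
    Account peek(long id) { return cached.get(id); }

    // Returns a clone so callers can edit without touching the shared instance.
    Account checkOut(long id) {
        Account original = cached.get(id);
        return original == null ? null : original.copy();
    }

    // Copies the clone's state back onto the cached instance it came from.
    void sync(Account clone) {
        Account original = cached.get(clone.getId());
        if (original == null) {
            throw new IllegalStateException("not cached: " + clone.getId());
        }
        original.setEmail(clone.getEmail());
    }
}
```

Working on a clone means half-finished edits never leak to other code that holds the cached instance.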

In all honesty, the coolest part about this talk was the lack of intimidation you usually see in CF programmers when faced with sessions like this one. They understood the building blocks used in the examples (i.e. - ColdSpring, Transfer, etc.). Mark even went into an explanation of how garbage collection works in the JVM as it relates to “soft” references of objects. It was really interesting and there weren’t any jackasses raising their hands saying, “Well I wrote my own garbage collector”! I feel proud of the ColdFusion community; it really seems like people are starting to get a clue. Well, at least this group of developers.
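For anyone who hasn’t run into “soft” references, here is a small Java sketch of a cache built on `java.lang.ref.SoftReference`. It is not how Transfer implements its cache; `SoftCache` and `simulateCollection` are hypothetical names, and the latter only mimics the garbage collector by calling `SoftReference.clear()`, since a real collection can’t be forced on demand. The JVM may clear softly reachable objects when memory runs low, and is guaranteed to have cleared them before it throws an `OutOfMemoryError`.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Demo entry point, declared first so the file can be launched directly.
final class SoftReferenceSketch {
    public static void main(String[] args) {
        SoftCache<String, byte[]> cache = new SoftCache<>();
        byte[] payload = new byte[1024];
        cache.put("report", payload);

        // While we still hold a strong reference, the entry cannot be cleared.
        System.out.println(cache.get("report") == payload);

        // Stand-in for the collector reclaiming the value under memory pressure.
        cache.simulateCollection("report");
        System.out.println(cache.get("report") == null);
    }
}

// A cache whose values are only softly reachable through the cache itself.
final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();

    void put(K key, V value) { entries.put(key, new SoftReference<>(value)); }

    // Returns the value, or null if it was never cached or has been cleared.
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key); // the referent is gone, so drop the stale entry
        }
        return value;
    }

    int size() { return entries.size(); }

    // Demo-only helper: clears the reference the way the collector would.
    void simulateCollection(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref != null) {
            ref.clear();
        }
    }
}
```

Callers of a cache like this always have to handle a `null` from `get` by rebuilding or reloading the value, which is the price of letting the JVM reclaim memory on its own schedule.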