Does object persistence complicate architectures with existing data models?

Brian LeGros | October 7th, 2007 | programming  

Recently at work we’ve been using a lot of ORM solutions on the Java side for our service APIs. Our intent is to focus the interaction we have with our DB resources to that of object persistence rather than ad-hoc data access. So technology-wise the transition has involved going from Spring JDBC to things like iBatis and Hibernate.

To take a step back, let me qualify what I consider to be object persistence versus data access. I see data access as the direct use of a data model (e.g. - relational model via an RDBMS, hierarchical model via XML, etc.) by an application. Typically I categorize data access code as code that works directly with the data model’s query language (e.g. - SQL, XQuery/XPath, etc.). Now, it usually benefits me when working in an object-oriented paradigm to find some type of mapping between the data model and the application’s object model, which is where I see the desire to migrate to an object persistence paradigm for data access. In fact, I consider object persistence just a subset and abstraction of the term data access I’ve defined above. For me object persistence implies that data access is distilled into a simple set of conventions and/or configurable options that allow objects from an application’s domain to gain basic CRUD (create, read, update, delete) behaviors. A few popular patterns for object persistence come to mind (ActiveRecord and Data Mapper), but I get the impression that what is perceived as a mature object persistence solution is one which completely abstracts away direct interaction with the data model.
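To make the distinction concrete, here is a minimal sketch of the Data Mapper flavor of object persistence next to plain data access. It is written in Python against an in-memory SQLite database purely for illustration; the `Customer` class, the `CustomerMapper` class, and the `customer` table are hypothetical names of my own, not something taken from Hibernate or iBatis.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    """Plain domain object (hypothetical); it knows nothing about SQL."""
    name: str
    email: str
    id: Optional[int] = None


class CustomerMapper:
    """Data Mapper sketch: the only place that speaks SQL for Customer."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS customer ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL)"
        )

    def create(self, customer: Customer) -> Customer:
        cur = self.conn.execute(
            "INSERT INTO customer (name, email) VALUES (?, ?)",
            (customer.name, customer.email),
        )
        customer.id = cur.lastrowid  # key generated by the database
        return customer

    def read(self, customer_id: int) -> Optional[Customer]:
        row = self.conn.execute(
            "SELECT name, email, id FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(*row) if row else None

    def update(self, customer: Customer) -> None:
        self.conn.execute(
            "UPDATE customer SET name = ?, email = ? WHERE id = ?",
            (customer.name, customer.email, customer.id),
        )

    def delete(self, customer_id: int) -> None:
        self.conn.execute("DELETE FROM customer WHERE id = ?", (customer_id,))


conn = sqlite3.connect(":memory:")
mapper = CustomerMapper(conn)

# Object persistence: the caller only ever touches Customer objects.
saved = mapper.create(Customer(name="Ada", email="ada@example.com"))
saved.email = "ada@example.org"
mapper.update(saved)
loaded = mapper.read(saved.id)

# Data access: the caller writes the data model's query language directly.
raw = conn.execute(
    "SELECT email FROM customer WHERE name = ?", ("Ada",)
).fetchone()
```

Both halves hit the same table; the difference is only in who has to know the SQL. With ActiveRecord the CRUD methods would live on `Customer` itself instead of on a separate mapper.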

I don’t necessarily think object persistence is a bad solution; in fact, I can see a lot of productivity coming from the reduced amount of work needed to build an integration layer into your application. What I wonder about is the following:

  1. When an object persistence solution should be selected for use with an application.
  2. The amount of productivity gained from using an object persistence solution.
  3. The degree to which an object persistence solution should abstract direct data access.

Let’s take the example of an application being developed from scratch without an existing relational model. I think the 3 points from above are easily answered by looking at how popular the Rails framework has become. I think when building an application anew, having an abstraction for the relational model is a great approach. As a developer you can focus solely on your object model which, in my opinion, ideally means more time for working on modeling a domain and deciding on your application’s architecture versus how it’s persisted.

Unfortunately, I think a majority of shops that like to apply the “enterprise” buzzword to their software infrastructure don’t have the luxury of building their data models from scratch. Let’s take the example of building an application from scratch using an existing relational model. For this example I think there are a few divisions I have to make before forming my analysis:

  1. The relational model was constructed focusing on standards (i.e. - SQL DDL) and over time has become fairly normalized.
  2. The relational model was constructed utilizing vendor specific objects and over time has become fairly normalized.
  3. The relational model was constructed utilizing standards and vendor specific objects and over time has become more disheveled than normalized.

For the first division, I think an object persistence solution is extremely useful. In my opinion, using a solution that implements the Data Mapper pattern will result in productivity because the implication is that the data access being performed isn’t that complicated. I also think that abstracting out direct data access may be beneficial so that it motivates the keepers of the relational model to maintain all of the work they’ve put into normalization moving forward.

For the second division, I think similar assumptions can be made as they were for the first division. From what I can tell it seems like most “mature” object persistence solutions provide you a means by which to work with vendor specific objects (e.g. - stored procedures, custom objects, rules, etc). I think normalization will save the day as long as the vendor objects in the relational model don’t contain business logic which hasn’t found its way into your application when it should.

This point I feel leads us to the third division; your shit is all mixed up from a relational perspective. I think object persistence can still be considered for use with an application of this type, but the complexity of the application should come into play as well when making the decision. I think a great gauge of whether or not to go the object persistence route is the complexity of your domain. If I find that I’ve got 2 domain classes with a simple composition relationship, I don’t think using an object persistence solution is going to make me productive; heck, the domain may even be overkill. As far as direct data access goes, if normalization isn’t there to provide some structure to the relational model, the implication, at least in my company’s case, is that vendor objects were used to supplement the benefits you get from normalization. This means that business logic that possibly should be contained in the application is tucked away in a vendor object; short of rewriting that vendor object, you may need to reuse the functionality it provides. Direct data access at this point is the only option from what I can tell (e.g. - you need to use an inline non-deterministic SQL function as criteria for a join clause in a SELECT statement). From what I’ve seen in iBatis, object persistence with direct data access is definitely a real and usable option.
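Here is a rough sketch of what I mean by object persistence with direct data access. It is Python with SQLite rather than iBatis, and everything in it is assumed for illustration: the `legacy_tier` function stands in for a vendor object holding business logic (and, unlike my real-world case, it is deterministic), and the `orders` and `discount` tables are made up. The point is only that the hand-written SQL, vendor function and all, stays in one mapped statement while the caller still gets objects back.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass
class OrderDiscount:
    """Domain object built from the mapped statement (hypothetical)."""
    order_id: int
    customer: str
    percent: int


def legacy_tier(total: float) -> str:
    """Hypothetical stand-in for business logic tucked away in a vendor object."""
    return "gold" if total >= 100 else "standard"


# Hand-written SQL kept beside the mapping, in the spirit of a mapped statement.
# The function call sits in the join criteria, which a generated query
# would have no way of knowing about.
ORDER_DISCOUNT_SQL = """
    SELECT o.id, o.customer, d.percent
    FROM orders o
    JOIN discount d ON d.tier = legacy_tier(o.total)
    ORDER BY o.id
"""


def find_order_discounts(conn: sqlite3.Connection) -> List[OrderDiscount]:
    rows = conn.execute(ORDER_DISCOUNT_SQL).fetchall()
    return [OrderDiscount(*row) for row in rows]


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL can call it like a database function.
conn.create_function("legacy_tier", 1, legacy_tier)
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
    CREATE TABLE discount (tier TEXT PRIMARY KEY, percent INTEGER);
    INSERT INTO orders VALUES (10, 'Ada', 250.0);
    INSERT INTO orders VALUES (11, 'Grace', 40.0);
    INSERT INTO discount VALUES ('gold', 15);
    INSERT INTO discount VALUES ('standard', 0);
""")

discounts = find_order_discounts(conn)
```

The application code never sees the join; it just asks for `OrderDiscount` objects. Whether that counts as "object persistence" or just tidy data access is exactly the line I’m trying to draw in this post.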

As one final point, I think utilizing an object persistence solution, especially in this third division, needs to be approached with a lot of caution when dealing with an existing data model. The forces that drove the creation of the existing data model are not the forces that are driving the creation of the application with which you need to integrate. From what I can tell, in the implementations I’ve seen at work, the object persistence API can definitely start to drive, at a minimum, the way the application domain is being constructed. I’m not just talking about semantics; I’m also talking about data encapsulation and relationships. From what I can tell, depending on your architectural approach (e.g. - fat models), this can result in some pretty strange object behaviors emerging. I don’t have any good examples, so that pretty much just makes this paragraph bunk, so chalk these last few sentences up to an opinion. I would encourage readers, however, to do my work for me; show me some examples where you find this to be true.

Overall, I really like the idea of using object persistence in my applications, if I feel the complexity of the application and the data model being used warrant it. I do think sometimes that ORM tools are overused, but as employed developers I still think it comes back to finding the right set of tools for the job and being productive. If object persistence isn’t going to fulfill those requirements, just remember that it’s ok to not use one.

NOTE : I read a really good blog post about ActiveRecord and Rails yesterday that motivated me to finish this post. Even though this post isn’t about patterns in object persistence, I think it’s a great read if you have some time.

cf.objective() 2007 : ColdBox Re-design Discussions

admin | May 6th, 2007 | conferences  

This wasn’t a session at the conference, but it caused me to miss a few sessions today. I wasn’t that impressed with the topic selection today anyway, so this was more fun and more educational. I was able to catch up with Luis Majano for almost 4-5 hours today. I interviewed him at my previous job and really got a chance to know him better today. He humored me and gave me a full walk-through of ColdBox over an espresso. Wow, he has really found a good balance between convention and configuration, I think. I really think that ColdBox has the ability to do for ColdFusion what Rails did for Ruby, or at least to a certain degree. Luis has built in more tools than I can think of to help with productivity and debugging. ColdBox has framework configuration options available in XML, but the bulk of the work done by the application programmer is done by using convention. ColdBox has tools to generate a working application skeleton, a robust caching mechanism, and an amazingly sexy debugging utility. What’s even crazier is that ColdBox gains most of its power from the massive number of plug-ins available for the framework and the ease with which they can be created. ColdBox comes with support for Java localization and layout management out of the box and, even better, if you want they can be turned off. There are tons of things I’m leaving out, but I would definitely suggest checking it out. I’m seriously considering it over Fusebox after talking to Luis.

In an effort to better understand how the framework works and to help Luis with the 2.0 release, we sat down and did a domain model analysis to see if we could develop a more descriptive domain model. We made some good progress and will hopefully pick back up tomorrow.

The sheer amount of work that Luis has put into this framework is amazing. With the current documentation available for the project and Luis’ robust approach to product management, this framework is not going away. The tools alone make it worth using this framework. Definitely check it out.

cf.objective() 2007 : Real World SOA: Building Services with ColdSpring and Transfer

admin | May 6th, 2007 | conferences  

Well, in my normal indecisive fashion, I’m giving the CF community its check mark back. Sean Corfield fails to disappoint. Even though he’s not working with Adobe anymore, he went into the details of how Adobe.com absorbed Macromedia.com and built a robust service layer used by “tons” of web applications. He outlined the basic necessities of building a SOA as below:

  • Data Dictionary (aka Canonical Data Model)
  • Service Directory for discovery purposes
  • Creation of software APIs as services (SaaS)

He gave some pretty cool statistics regarding SOA growth in the USA. He showed a study which revealed that SOA-based applications have grown by 20% over the last year and are expected to increase by 6% annually. His prime example of this was SalesForce.com, but I’ve heard nothing but bad things about their success, so who knows.

Sean outlined the distinction of 4 tiers commonly found in SOA:

  • Controller/Presentation
  • Service
  • Business/Local Model
  • Database/Persistence

In Adobe.com they implemented all of the tiers using ColdFusion and Oracle for the persistence tier. Here are a few points he made regarding good practices:

  • When creating services for the Service Tier, try to use stateless services which can be used to emulate business processes.
  • When requesting a service, authentication should be a major concern. Sean suggested using SSO solutions with randomized tokens being passed. When using this approach, it was mentioned that a policy for token clean-up should be addressed up-front.
  • When constructing datatypes for the canonical data model, consider using primitive types which are shared between environments.
  • Perform a full analysis of the protocols available in your environment and the ability of each environment to work with that protocol (e.g. - SOAP with Java is not as easy as it may seem, whereas with CF it’s much easier).
  • When creating the canonical data model, expose the minimum amount of commonality in your data. Exposing information specific to some systems but not others may cause dependencies through misuse of the API and accidental tight coupling with consumers.
  • When creating a common exception data model, keep the means by which an error is reported as simple as possible. Distinguish success and failure. For success, return the result of the RPC call. For failure, return a basic exception model containing a type, message, and details. In doing so, programmatic mappings can occur as well as human-readable messages.
  • Solving caching concerns is difficult. If you can utilize an existing commercial protocol for messaging, like JMS, do so to avoid having to write code for this problem.
  • Unit test the crap out of everything.
  • Don’t allow legacy data models to dictate your canonical data model. Start from scratch to avoid being stuck in stale models.
  • Avoid SOA explosion and extreme API granularity.
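
The common exception data model point is the easiest one to sketch. What follows is my own guess at the shape in Python, not what Adobe.com actually runs (that was ColdFusion, and Sean didn’t show code for it); `ServiceResponse`, `ServiceError`, `call_service`, and `get_price` are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ServiceError:
    """Basic exception model: a type, a message, and details."""
    type: str      # stable name a consumer can map on programmatically
    message: str   # human-readable summary
    details: str   # extra context for whoever has to debug it


@dataclass
class ServiceResponse:
    """Envelope that makes success and failure explicit."""
    success: bool
    result: Any = None
    error: Optional[ServiceError] = None


def call_service(operation: Callable[..., Any], *args: Any) -> ServiceResponse:
    """Run a stateless operation and wrap the outcome in the envelope."""
    try:
        return ServiceResponse(success=True, result=operation(*args))
    except Exception as exc:
        return ServiceResponse(
            success=False,
            error=ServiceError(
                type=type(exc).__name__,
                message=str(exc),
                details=f"arguments: {args!r}",
            ),
        )


def get_price(sku: str) -> float:
    """Hypothetical stateless service operation."""
    prices = {"A-100": 19.99}
    if sku not in prices:
        raise ValueError(f"unknown sku {sku}")
    return prices[sku]


ok = call_service(get_price, "A-100")
failed = call_service(get_price, "Z-999")
```

A consumer checks `success` first, then either reads `result` or switches on `error.type`, which keeps the error contract the same across every service.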

It was funny to see Sean say that they chose RESTful web services instead of JMS for internal messaging. His reason was that they could get things done more quickly with web services, but he acknowledged JMS would have been a better solution and that they got lazy.

This talk was extremely reassuring and just a great talk. Great job Sean.

cf.objective() 2007 : The CFEclipse Project

admin | May 6th, 2007 | conferences  

I’m really glad my co-workers said they wanted me to go to this talk. Mark Drew has a hilariously dry sense of belittling humor that is awesome. He showed off the latest features in CFEclipse 1.3.whatever. There was a lot of stuff I didn’t know. This is what I can remember from today since I didn’t take as many notes:

  • Ability to drag stylesheet, script, and image files into any page and have the HTML for them auto-generated
  • Integration of ftp and sftp into the File Explorer view
  • Crazy templating in snippets; definitely something we should do at work to make writing CFML easier. We can even share these on a network drive to keep it simple.
  • Scribble pad integration actually working was cool

The new great features for CFEclipse were:

  • The inclusion of the CFUnit/CFCUnit test runner in the latest update site release as an optional add-on
  • The addition of an XML Framework configuration tool which can be extended via XML by the community

Some of the staples in the CF community have put together templates for Fusebox, Model-Glue, Mach-II, Transfer, Reactor, and ColdSpring. The basic idea of the optional add-on is that a view is now available to show you elements specific to the framework that uses XML to configure itself. Mark had not added Transfer support as of this morning; it took him roughly 15 minutes to do so via the extensibility markup that comes with the add-on. Additionally, you can script common actions and practices into the add-on so the community can contribute their favorite ways to configure a framework with a click of a button. In the end, you’re writing XML to configure a tool to abstract XML configurations for CF frameworks. The idea is cool to me, but we’ll see how it goes.

Oh BTW, it also now comes with an XPath query testing tool. You paste an XML fragment into a textarea, enter an XPath expression in a textbox, and then hit “Run”. BOOM, you’re done :)
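
If you don’t have CFEclipse handy, the same paste-a-fragment, run-an-expression loop can be approximated in a few lines of Python. This is just a sketch with the standard library’s `xml.etree.ElementTree`, which only supports a limited subset of XPath; the `run_xpath` helper and the bean definitions in the fragment are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment, loosely shaped like a framework config file.
xml_fragment = """
<beans>
  <bean id="userService" class="model.UserService"/>
  <bean id="userGateway" class="model.UserGateway"/>
</beans>
"""


def run_xpath(fragment: str, expression: str) -> list:
    """Parse the fragment and return the elements matching the expression."""
    root = ET.fromstring(fragment)
    return root.findall(expression)


# Find the bean whose id attribute is "userGateway".
matches = run_xpath(xml_fragment, ".//bean[@id='userGateway']")
classes = [element.get("class") for element in matches]
```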