Custom types are hard to avoid when designing object-oriented software. I’ve always seen them as a way to bring organization and semantic meaning to my applications. I find they help me better describe data, relationships, and behavior, especially when implemented as classes. When I first began programming, however, my biggest question was how heavily I should be using classes. I always end up trying to find a balance between the following two questions:
- To what extent do I utilize a class for semantic purposes?
- To what extent do I utilize a class for reuse?
When does the use of a class for semantic reasons become overkill, and when am I designing for too much flexibility? I’ve come to favor semantics over reuse; however, I feel both are very important in the creation of successful architectural designs.
So let’s talk about domains. What are they and why do I care? Well, for starters, when I refer to a domain, what I’m actually referring to is the set of identifiable entities in a “real-world” problem that a software application is built to solve. Put more simply, a domain can be a collection of objects from the “real world” recreated in software to solve a problem. Again, this is just my definition, so yours may vary.
Please understand that the purpose of this post is not to explain how to conduct a domain analysis. There are much smarter people in the world, with much better ideas than mine, who can help you in this area. That being said, below is a UML class diagram describing an example domain for a very simple search engine:
Example 1 – Class Diagram
The Search1 class is used to represent a search executed by a user. The SearchCriteria class represents the possible criteria that can be used to execute a search. SearchEngine1 is a very naive attempt at representing some type of Facade which uses an external search engine to execute a search. In this example, let’s say the ultimate goal of a search is to return some type of collection of strings, each of which represents a URI somehow related to the search criteria provided.
Now in this example, I’ve taken what I feel is a very pragmatic approach to representing the search results in my model; I decided to use a generic collection object of type List (arrays, sets, or any other type of generic collection could be used here, I just chose List). Please note, I make the assumption that the caller already has created the Search and SearchCriteria objects and the External Search Engine will return a List. Below is a simple sequence diagram which shows the order of messages sent within the application:
Example 1 – Sequence Diagram
So in Example 1 I have reused a collection object with a stock set of behavior that I am familiar with and can probably be productive with at little effort. On the other hand, I have sacrificed the semantics of my domain, because the notion of search results is nothing more than a generic collection. Let’s say that I want to do something more with search results than List provides. What if I want to sort search results? If it’s a simple ascending or descending alphabetical sort, then yeah, my generic collection can probably handle the work, but what about a sort by domain name or protocol? If we were to implement this using the model from Example 1, the Caller actor would have to take on the responsibility (assume the behavior) of sorting the search results, because we chose to go with a generic collection. Although coupling remains loose, since many different classes may be typed as List, the Caller’s cohesion will be reduced because it’s assuming the behavior of a domain it’s consuming. What about this design decision from a maintenance point of view: will other developers immediately know where to look for the behavior associated with sorting search results? Is the answer still yes if you have multiple Caller actors which call execute() on the Search object?
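To make the problem concrete, here is a minimal Python sketch of how Example 1 might look in code. It is only an illustration under assumptions: the `keywords` field, the canned URIs standing in for the external search engine, and the `caller_sorted_by_domain` function are all hypothetical and do not come from the diagrams.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class SearchCriteria:
    keywords: str  # hypothetical field, assumed for illustration


class SearchEngine1:
    """Facade over an external search engine (stubbed with canned data)."""

    def execute(self, criteria: SearchCriteria) -> list[str]:
        # Stand-in for whatever the external engine would return.
        return [
            "https://zeta.example.org/page",
            "ftp://alpha.example.com/file",
            "http://mid.example.net/",
        ]


class Search1:
    """A search executed by a user; results are a plain list of URIs."""

    def __init__(self, criteria: SearchCriteria, engine: SearchEngine1) -> None:
        self._criteria = criteria
        self._engine = engine

    def execute(self) -> list[str]:
        return self._engine.execute(self._criteria)


def caller_sorted_by_domain(search: Search1) -> list[str]:
    """Caller-side code: the sorting rule lives here, outside the domain."""
    results = search.execute()
    return sorted(results, key=lambda uri: urlparse(uri).hostname or "")
```

Every Caller that wants results ordered by domain name has to repeat (or import) something like `caller_sorted_by_domain`, which is exactly the loss of cohesion described above.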
Example 2 is an alternative to Example 1 where semantics were chosen over reuse:
Example 2 – Class Diagram
Example 2 – Sequence Diagram
In this example, the External Search Engine is still returning a List, but the SearchEngine2 class is digesting that List and converting it into a SearchResults object (SearchResults could even inherit from List). The SearchResults class has behavior associated with it and now has semantic meaning in the context of the domain. From a maintenance perspective, it is also easy for other developers to see where the behavior for sorting search results resides. Additionally, the behavior associated with SearchResults is completely hidden from the Caller, so cohesion remains high for both the actor and the object. Coupling, on the other hand, does get tighter, because the Caller now depends on the SearchResults type, but I see this as a worthwhile compromise to obtain a more semantically meaningful domain.
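A rough Python sketch of the Example 2 idea follows. The method names `sorted_by_domain` and `sorted_by_protocol`, and the choice to pass the external search engine in as a plain callable, are assumptions made for illustration; the diagrams do not prescribe them. Here SearchResults wraps a list rather than inheriting from it, which is just one of the options mentioned above.

```python
from typing import Callable, Iterable, Iterator
from urllib.parse import urlparse


class SearchResults:
    """Domain type for search results; sorting behavior lives here."""

    def __init__(self, uris: Iterable[str]) -> None:
        self._uris = list(uris)

    def __iter__(self) -> Iterator[str]:
        return iter(self._uris)

    def __len__(self) -> int:
        return len(self._uris)

    def sorted_by_domain(self) -> "SearchResults":
        # Order by host name, e.g. "alpha.example.com" before "mid.example.net".
        key = lambda uri: urlparse(uri).hostname or ""
        return SearchResults(sorted(self._uris, key=key))

    def sorted_by_protocol(self) -> "SearchResults":
        # Order by URI scheme, e.g. "ftp" before "http" before "https".
        key = lambda uri: urlparse(uri).scheme
        return SearchResults(sorted(self._uris, key=key))


class SearchEngine2:
    """Facade that converts the external engine's list into SearchResults."""

    def __init__(self, external_search: Callable[[str], list[str]]) -> None:
        # external_search is a hypothetical stand-in for the external engine.
        self._external_search = external_search

    def execute(self, criteria: str) -> SearchResults:
        return SearchResults(self._external_search(criteria))
```

A Caller now writes something like `engine.execute(criteria).sorted_by_domain()` and never needs to know how the ordering is computed.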
I find myself choosing semantics over reuse in similar situations, such as reporting. I’ve said this in the past, but reports very commonly represent entities within a domain. In fact, I see a very common behavior associated with reports: rendering report contents (not to be confused with how to deliver the rendered contents of a report). If a report is being generated from a database, for example, why should the calling process have its cohesion reduced just so that an abstract data type can be used (e.g. a RecordSet class or the like)? How maintainable does the notion of rendering a report become when other calling processes have to utilize the behavior? What if we need to render a report in multiple ways instead of one default way (e.g. fixed length, comma-delimited, XML)?
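As a sketch of the reporting case, a report type might own its rendering behavior along these lines. The `SalesReport` name, its two columns, and the method names are all hypothetical; the point is only that the three renderings mentioned above sit on the domain type instead of in each calling process.

```python
import csv
import io
import xml.etree.ElementTree as ET


class SalesReport:
    """Hypothetical report entity that knows how to render itself."""

    COLUMNS = ("region", "total")  # assumed columns, for illustration only

    def __init__(self, rows) -> None:
        # Rows would typically come from a database query; values are
        # normalized to strings so every renderer sees the same data.
        self._rows = [tuple(str(value) for value in row) for row in rows]

    def render_comma_delimited(self) -> str:
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(self.COLUMNS)
        writer.writerows(self._rows)
        return buffer.getvalue()

    def render_fixed_length(self, width: int = 10) -> str:
        # Each field is truncated or padded with spaces to exactly `width`.
        lines = [
            "".join(value[:width].ljust(width) for value in row)
            for row in self._rows
        ]
        return "\n".join(lines)

    def render_xml(self) -> str:
        root = ET.Element("report")
        for row in self._rows:
            item = ET.SubElement(root, "row")
            for name, value in zip(self.COLUMNS, row):
                ET.SubElement(item, name).text = value
        return ET.tostring(root, encoding="unicode")
```

Delivering the rendered text (file, email, HTTP response) stays outside the class, in line with the distinction between rendering and delivery made above.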
Please keep in mind, I’ve said that my preference is choosing semantics over reuse when modeling a domain, but I still see this as a preference. I am not saying that I will never use abstract data types when modeling a domain. I am saying, however, that if the data, relationships, and behavior associated with the domain warrant the use of a custom type, I am much more likely to create that type than to look for an opportunity for reuse. I also like to use the “pseudo-metrics” of cohesion and coupling as a check and balance to help gauge the trade-offs I’m making in the domain. I have begun to use these principles even more with my current employer than ever before. I think the fact that we have to live with our code for quite some time (5+ years in some cases) is definitely another driving factor for my preferences.
As I continue this series, please try to keep an open mind and realize that this post will serve as the basis for other posts as well. I have many more styles to talk about. Thanks for reading this far.