Loose Coupling in a Service-Oriented Architecture - 4.2.2 Heterogeneous Data Types
Now, let's discuss my favorite example of loose coupling: the harmonization of data types across distributed systems. This topic always leads to much discussion, and understanding it is key to understanding large systems.
There is no doubt that life is a lot easier if data types are shared across all systems. For this reason, harmonizing data types is a "natural" approach. In fact, when object-orientation became mainstream, having a common business object model (BOM) became a general goal. But it turned out that this approach was a recipe for disaster for large systems.
"WHAT A SHAME THAT WE CAN'T HARMONIZE ANYMORE"
A while ago, I gave a talk about SOA that included my usual claim that you have to accept that you can't harmonize data types on large systems. A senior systems architect came to me during the coffee break and said, "Isn't it sad how bad things have become in our industry when we can't even harmonize data types anymore? In the past we were able to analyze and design our systems." It sounded like a criticism of the younger generations of software developers, who are able to do only one thing: copy and paste.
My answer was as follows: "You might be right. In our crazy times, we simply do not have enough time for careful and long analysis and design. The marketing guys rule the world, and if we don't deliver in time, we are out of the market (even if we have better quality). But be careful, and don't underestimate the level of complexity we have reached now by connecting systems. Remember, there is no central control any longer, and if there were it wouldn't work. If we had harmonized an address type before we shipped the Internet protocols, the Internet would never have become reality. Large-scale systems need the minimal consensus you can provide to be successful. Note that you can still harmonize data types when things run."
The first reason for the disaster was an organizational one: it was simply not possible to come to an agreement for harmonized types. The views and interests of the different systems were too varied. Because large distributed systems typically have different owners, it was tough to reach agreements. Either you didn't fulfill all interests, or your model became far too complicated, or it simply was never finished. This is a perfect example of "analysis paralysis": if you try to achieve perfection when analyzing all requirements, you'll never finish the job.
You might claim that the solution is to introduce a central role (a systems architect or a "model master") that resolves all open questions, so that one common BOM with harmonized data types becomes a reality. But then you'll run into another fundamental problem: different systems evolve differently. Say you create a harmonized data type for customers. Later, a billing system might need two new customer attributes to deal with different tax rates, while a CRM system might introduce new forms of electronic addresses, and an offering system might need attributes to deal with privacy protection. If a customer data type is shared among all your systems (including systems not interested in any of these extensions), all the systems will have to be updated to reflect each change, and the customer data type will become more and more complicated.
Sooner or later, the price of harmonization becomes too high. Keeping all the systems in sync is simply too expensive in terms of time and money. And even if you manage to succeed, your next company merger will introduce heterogeneity again!
Common BOMs do not scale because they lead to a coupling of systems that is too tight. As a consequence, you have to accept the fact that data types on large distributed systems will not be harmonized. In decoupled large systems, data types differ (see Figure 4-2).
Figure 4-2. Decoupling by using different data types
Again, there is a price to pay for this decision: if data types are not harmonized, you need data type mappings (which include technical and semantic aspects). Although mapping adds complexity, it is a good sign in large systems because it demonstrates that components are decoupled.
The usual approach is that a service provider defines the data types used by the services it provides (which might be ruled by some general conventions and constraints). The service consumers have to accept these types. Note that a service consumer should avoid using the provider's data types in its own source code. Instead, a consumer should have a thin mapping layer to map the provider's data types to its own data types. See Section 12.3.1 for a detailed explanation of why this is important.
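A thin consumer-side mapping layer of this kind might look like the following minimal Python sketch. All names here (ProviderCustomer, Customer, to_consumer_customer, and the individual fields) are hypothetical and chosen only for illustration; the point is that the provider's type is touched in exactly one function.

```python
from dataclasses import dataclass


# Hypothetical shape of a customer record as defined by the service provider.
@dataclass(frozen=True)
class ProviderCustomer:
    cust_id: str
    first_name: str
    last_name: str
    tax_class: str  # provider-specific attribute this consumer does not need


# The consumer's own customer type, used everywhere else in its code.
@dataclass(frozen=True)
class Customer:
    id: str
    display_name: str


def to_consumer_customer(provider: ProviderCustomer) -> Customer:
    """Map the provider's type to the consumer's type (the thin mapping layer)."""
    name = f"{provider.first_name} {provider.last_name}".strip()
    return Customer(id=provider.cust_id, display_name=name)
```

If the provider later renames or adds attributes, only to_consumer_customer has to change; the rest of the consumer keeps working with its own Customer type.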
Again, there are two sides to introducing this form of loose coupling in SOA (or distributed systems in general). Having no common business data model has pros and cons:
The advantage is that systems can modify their data types without directly affecting other systems (modified service interfaces affect only corresponding consumers).
The drawback is that you have to map data types from one system to another.
Note that you will need some fundamental data types to be shared between all applications. But to promote loose coupling, fundamental data types harmonized for all services should usually be very basic. The most complicated common data type I've seen a phone company introduce in a SOA landscape was a data type for a phone number (a structure/record of country code, area code, and local number). The attempt to harmonize a common type for addresses (customer addresses, invoice addresses, etc.) failed. One reason was an inability to agree on how to deal with titles of nobility. Another reason was that different systems and tools had different constraints on how to process and print addresses on letters and parcels.
If you are surprised about this low level of harmonization, think about what it means to modify a basic type and roll out the modifications across all systems at the same time (see Section 18.4.9 for details). In practice, fundamental service data types must be stable.
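To give a feel for how basic such a shared type is, the phone number record mentioned above could be sketched as follows. This is only an illustration: the field names, the use of strings (to preserve leading zeros), the digits-only check, and the international() helper are all assumptions, not part of any real harmonized model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a fundamental shared type should not mutate
class PhoneNumber:
    country_code: str
    area_code: str
    local_number: str

    def __post_init__(self) -> None:
        # Assumption for this sketch: every part consists of digits only.
        for name in ("country_code", "area_code", "local_number"):
            value = getattr(self, name)
            if not value.isdigit():
                raise ValueError(f"{name} must contain digits only: {value!r}")

    def international(self) -> str:
        """Render the number in a simple '+CC AREA LOCAL' form."""
        return f"+{self.country_code} {self.area_code} {self.local_number}"
```

Even for a record this small, adding a fourth field (an extension, say) would force every system that shares the type to be updated, which is why such types need to stay stable.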
Does this mean that you can't have harmonized address data types in a SOA? Not necessarily. If you are able to harmonize, do it. Harmonization helps. However, don't fall into the trap of requiring that data types be harmonized. This approach doesn't scale.
If you can't harmonize an address type, does this mean that all consumers have to deal with the differences between multiple address types? No. The usual approach in SOA is to introduce a composed service that allows you to query and modify addresses (composed services are discussed in Chapter 6). This service then deals with differences between the backend systems by mapping the data appropriately.
Note that with this approach, there's still no need to have one common view of addresses. If you get new requirements, you can simply introduce a second address service mapping the additional attributes to the different backends. Existing consumers that don't share the additional requirements will not be affected.
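The composed-service idea can be sketched roughly as follows. Everything in this example is hypothetical: the two in-memory "backends", their record shapes, and the function names are assumptions made up for illustration, and only the query side is shown. A first service, get_address, hides the backend differences; a second service, get_address_with_email, adds an attribute for consumers that need it without changing the first.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical backends with different address representations.
CRM_BACKEND = {
    "c1": {"street": "1 Main St", "zip": "12345", "town": "Springfield",
           "email": "c1@example.org"},
}
BILLING_BACKEND = {
    "c2": ("2 Side Rd", "54321 Shelbyville"),  # (street, "postal_code city")
}


@dataclass(frozen=True)
class Address:
    street: str
    postal_code: str
    city: str


def _from_crm(record: dict) -> Address:
    return Address(record["street"], record["zip"], record["town"])


def _from_billing(record: tuple) -> Address:
    street, zip_and_city = record
    postal_code, city = zip_and_city.split(" ", 1)
    return Address(street, postal_code, city)


def get_address(customer_id: str) -> Optional[Address]:
    """Composed service: consumers never see which backend holds the data."""
    if customer_id in CRM_BACKEND:
        return _from_crm(CRM_BACKEND[customer_id])
    if customer_id in BILLING_BACKEND:
        return _from_billing(BILLING_BACKEND[customer_id])
    return None


@dataclass(frozen=True)
class AddressWithEmail:
    address: Address
    email: Optional[str]


def get_address_with_email(customer_id: str) -> Optional[AddressWithEmail]:
    """Second service for a new requirement; get_address stays unchanged."""
    address = get_address(customer_id)
    if address is None:
        return None
    email = CRM_BACKEND.get(customer_id, {}).get("email")
    return AddressWithEmail(address, email)
```

Consumers of get_address are unaffected by the introduction of get_address_with_email, which mirrors the point that existing consumers need not share the additional requirements.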