Composite UI for Service Oriented Systems – Messaging and fault tolerance


This blog post is part of a larger series. Get the overview of the content here.


Consider a classical application approach where clients invoke remote procedure calls on the server. What happens to the initiating request when a crash occurs, for example when the IIS app pool recycles, or when the remote host refuses a connection because too many transactions are waiting to time out? The initiating request is lost or, if you are lucky, survives somewhere as cryptic information in a log file.


The messaging approach inverts the transaction management. The transaction is opened before a message is received from the queue. The service bus then invokes the code that handles the given message type. The executing code is enclosed in the same transaction, which spans the messaging infrastructure, the executing code, and finally the database that contains the business-relevant data. If processing fails at any point, the transaction is rolled back and the message (and hence the business intent) is put back into the messaging infrastructure.
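To make the rollback behavior concrete, here is a minimal in-memory sketch. It is not how any particular service bus implements it: the `deque` stands in for the queue, a plain `dict` stands in for the database, and `process_next` is a hypothetical name chosen for this illustration. The point is only that receiving the message, running the handler, and changing the data either all happen or all get undone.

```python
from collections import deque


def process_next(queue, database, handler):
    """Receive one message and run the handler inside a simulated transaction.

    Returns True when the work was committed, False when it was rolled back
    (or when there was nothing to receive).
    """
    if not queue:
        return False

    message = queue.popleft()      # the receive is part of the transaction
    snapshot = dict(database)      # remember the state before the handler runs

    try:
        handler(message, database)
    except Exception:
        # Rollback: undo the data changes and put the message back.
        database.clear()
        database.update(snapshot)
        queue.appendleft(message)
        return False

    return True                    # commit: the message is consumed for good


def place_order(message, database):
    """Hypothetical handler that fails halfway for one specific message."""
    database[message["order_id"]] = "placed"
    if message.get("explode"):
        raise RuntimeError("simulated crash after the write")
```

If the handler raises after it has already written to the dictionary, the write disappears again and the message is back at the front of the queue, ready for another attempt.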


But the approach above relies heavily on a distributed transaction coordinator, which is a real beast. If possible, avoid distributed transactions based on the two-phase commit protocol, such as XA with JTA under Java/Linux or DTC under Windows, especially when you are working on a greenfield system. Instead, design the process so that messages or their consumption are idempotent. If you need to roll back a business operation, then the business must support compensating actions. These can be modeled with messages pretty easily. Talk to your domain experts! For larger brownfield applications it can sometimes be helpful, for example during refactoring and redesign, to be able to rely on distributed transactions. A distributed transaction then spans multiple I/O resources, such as the transport itself and the involved databases (if they support two-phase commit). But distributed transactions have a large impact on processing speed and are cumbersome to manage when they fail.
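As a rough illustration of idempotent consumption and compensation, consider the sketch below. Everything in it is an assumption made for the example: the `AccountService` class, the message shapes, and the `Deposit` and `ReverseDeposit` message types are hypothetical and not taken from any framework.

```python
class AccountService:
    """Hypothetical handler that tolerates duplicate deliveries."""

    def __init__(self):
        self.balance = 0
        self.processed_ids = set()   # ids of messages already handled

    def handle(self, message):
        """Apply a message once; return False if it was a duplicate."""
        if message["id"] in self.processed_ids:
            return False             # already processed, do nothing

        if message["type"] == "Deposit":
            self.balance += message["amount"]
        elif message["type"] == "ReverseDeposit":
            # Compensation: a new message that undoes an earlier
            # business operation instead of a technical rollback.
            self.balance -= message["amount"]
        else:
            raise ValueError("unknown message type: " + message["type"])

        self.processed_ids.add(message["id"])
        return True
```

Delivering the same `Deposit` twice changes the balance only once, and the reversal is just another message with its own id. In a real system the set of processed ids would have to be stored in the same local database transaction as the business data, otherwise a crash between the two writes brings the problem back.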


To avoid relying on distributed transactions, you can use the native transactions available in your transport layer and the native transactions available in your database. If the database operation fails, your infrastructure doesn’t ACK the message in the transport, which instructs the transport to deliver the message again.
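The following sketch shows the order of operations: commit the local database transaction first, and only then acknowledge the message. The database part uses Python's built-in `sqlite3` module, where the connection used as a context manager commits on success and rolls back on an exception. The transport is a hypothetical in-memory `FakeTransport`, and the `orders` table and the `fail` switch exist only for this illustration.

```python
import sqlite3


class FakeTransport:
    """Hypothetical transport: unacknowledged messages are delivered again."""

    def __init__(self, messages):
        self.pending = list(messages)

    def receive(self):
        return self.pending[0] if self.pending else None

    def ack(self, message):
        self.pending.remove(message)


def consume_one(transport, connection, fail=False):
    """Handle one message; ACK it only after the database commit succeeded."""
    message = transport.receive()
    if message is None:
        return False

    try:
        with connection:  # native database transaction: commit or rollback
            connection.execute(
                "INSERT INTO orders (id, item) VALUES (?, ?)",
                (message["id"], message["item"]),
            )
            if fail:
                raise RuntimeError("simulated database failure")
    except Exception:
        return False      # no ACK, so the transport delivers the message again

    transport.ack(message)
    return True
```

Because a crash between the commit and the ACK also leads to redelivery, this approach gives at-least-once delivery, which is why idempotent consumption matters.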

In the next post we cover more messaging patterns.

About the author

Daniel Marbach

