Friday, 4 March 2011

Command-Query Responsibility Segregation (CQRS)

If you listen to the .Net Rocks! podcast you may have heard the episode where Udi Dahan Clarifies CQRS. CQRS? What’s that all about then?
Health warning: I’m learning here. Errors and misconceptions probable. Jump to References below to go to more authoritative resources.
It looks like CQRS has its roots in a principle that Bertrand Meyer devised as part of his work on the Eiffel programming language. Meyer called that principle Command-Query Separation (CQS). It states:
“…that every method should either be a command that performs an action, or a query that returns data to the caller, but not both. In other words, asking a question should not change the answer. More formally, methods should return a value only if they are referentially transparent and hence possess no side effects.” - Wikipedia
The implication of this principle is that a method with a return value cannot mutate state. Conversely, a method that mutates state must have a void return type.
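To make the principle concrete, here is a minimal sketch in Python. The `Counter` class and its method names are my own illustration, not something taken from Meyer or Eiffel. The command changes state and returns nothing; the query returns a value and changes nothing.

```python
class Counter:
    """A tiny class that follows Command-Query Separation (illustrative only)."""

    def __init__(self) -> None:
        self._count = 0

    def increment(self) -> None:
        """Command: mutates state and returns nothing."""
        self._count += 1

    def value(self) -> int:
        """Query: returns data and has no side effects."""
        return self._count


counter = Counter()
counter.increment()
counter.increment()

# Asking the question twice does not change the answer.
first = counter.value()
second = counter.value()
```

A method such as a hypothetical `increment_and_get()` would break the principle, because it both changes the count and returns it.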
CQRS extends CQS by effectively mandating that, when we interact with data stores, objects are assigned to one of two ‘types’: Commands and Queries. This division of responsibility extends into the rest of the architecture. You may have separate services dedicated to either commands (e.g. create, update and delete operations) or queries (read-only operations). It can even extend to the level of the data store: you may choose to have separate stores for read-only queries and for commands.
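As a rough sketch of that split at the service level, the following Python uses two separate classes, one per side. Everything named here (`RenameCustomer`, `CustomerCommandService`, `CustomerQueryService`, the dictionary standing in for a data store) is a hypothetical example of mine rather than part of any CQRS framework. For simplicity both services share a single store; with separate stores the query side would be handed its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenameCustomer:
    """Command: a request to change state (hypothetical example)."""
    customer_id: int
    new_name: str


class CustomerCommandService:
    """Command side: accepts commands and returns nothing."""

    def __init__(self, store: dict) -> None:
        self._store = store

    def handle(self, command: RenameCustomer) -> None:
        self._store[command.customer_id]["name"] = command.new_name


class CustomerQueryService:
    """Query side: returns data and never modifies it."""

    def __init__(self, store: dict) -> None:
        self._store = store

    def get_display_name(self, customer_id: int) -> str:
        return self._store[customer_id]["name"]


# A dict stands in for the data store; both sides share it in this sketch.
store = {1: {"name": "Ada"}}
commands = CustomerCommandService(store)
queries = CustomerQueryService(store)

commands.handle(RenameCustomer(customer_id=1, new_name="Grace"))
name = queries.get_display_name(1)
```

The point of the sketch is only that callers who want to change something talk to one service, and callers who want to read something talk to another.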
Why is this separation important? Well, it recognises that the two sides have very different requirements:
| | Command | Query |
|---|---|---|
| Data Store | Store normalised data for transactional updates etc. | Store denormalised data for fast and convenient querying (e.g. minimise the number of joins required). |
| Scalability | Scalability may be less important for commands. For example many web systems are skewed towards more frequent read-only operations. | Scalability may be very important, especially in web systems. We may want to use caching here. |
CQRS recognises that, rather than having one unified system where create, read, update and delete operations are treated as being the same, it may be better to have two complementary systems working side by side: one for read-only query operations and another for command operations. The following diagram is somewhat over-simplified but…
[Diagram: CQRS v1.0]

Features of the Command side

  • Our ‘data access layer’ for command operations becomes behavioural as opposed to data-centric.
  • Our data transfer objects don’t need to expose internal state – we create a DTO and fire it off with a command.
  • We don’t need to process commands immediately – they can be queued.
  • We might not even require an ‘always on’ connection to the data store.
  • We might use events to manage interaction between components in the command system.
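A few of the points above can be sketched together in Python: a behavioural write model, commands that are queued rather than processed immediately, and events raised once a command has been applied. All names here (`DepositMoney`, `MoneyDeposited`, `Account`, `send`, `process_pending`) are hypothetical, and the in-memory `deque` and list are stand-ins for a real queue and event bus.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class DepositMoney:
    """Command: carries only what is needed to perform the action."""
    account_id: str
    amount: int


@dataclass(frozen=True)
class MoneyDeposited:
    """Event: records that something has happened."""
    account_id: str
    amount: int


class Account:
    """Behavioural write model: exposes operations, not setters."""

    def __init__(self) -> None:
        self._balance = 0

    def deposit(self, amount: int) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount


accounts = {"acc-1": Account()}
pending = deque()   # commands waiting to be processed
published = []      # events raised by processed commands


def send(command: DepositMoney) -> None:
    """Queue the command; the caller does not wait for it to be applied."""
    pending.append(command)


def process_pending() -> None:
    """Apply queued commands in order and publish an event for each one."""
    while pending:
        command = pending.popleft()
        accounts[command.account_id].deposit(command.amount)
        published.append(MoneyDeposited(command.account_id, command.amount))


send(DepositMoney("acc-1", 50))
send(DepositMoney("acc-1", 25))
queued_before = len(pending)   # both commands are still waiting
process_pending()
```

Because `send` only appends to the queue, the sender never needs a live connection to the data store; only `process_pending` does.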

Features of the Query side (the Thin Read Layer)

  • We can create an object model that is optimised for display purposes (e.g. view models).
  • We can create a separate data store that is optimised to meet the needs of the display object model (e.g. data can be denormalised to fit the requirements of specific views).
  • We can optimise data access to prevent round trips to the data layer so that all the data required by a view is returned in one operation.
  • We can optimise read-only queries in isolation from the requirements of update operations (this might be compromised in a non-CQRS system where, for example, an ORM is used for all data access operations).
  • We can bypass direct interaction with the data store and use cached data (although we should be explicit about this when we do it – let the user know how fresh the data is).
  • We can use different levels of caching for different parts of the application (perhaps one screen requires data to be fresher than another).
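The caching points can be sketched as follows. This is a minimal illustration under my own assumptions: `CachedQueryService`, `load_view`, the view names and the age limits are all invented for the example, and the clock is injectable purely so the behaviour can be shown deterministically. Each view has its own freshness limit, and every read reports the age of the data so the user can be told how fresh it is.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CacheEntry:
    view_model: dict
    loaded_at: float


class CachedQueryService:
    """Query side that serves view models from a cache with per-view freshness limits."""

    def __init__(self, load_view: Callable[[str], dict],
                 max_age_seconds: Dict[str, float],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load_view = load_view
        self._max_age = max_age_seconds
        self._clock = clock
        self._cache: Dict[str, CacheEntry] = {}

    def get(self, view_name: str) -> Tuple[dict, float]:
        """Return the view model and its age in seconds."""
        now = self._clock()
        entry = self._cache.get(view_name)
        if entry is None or now - entry.loaded_at > self._max_age[view_name]:
            # Missing or stale: one call fetches everything the view needs.
            entry = CacheEntry(self._load_view(view_name), now)
            self._cache[view_name] = entry
        return entry.view_model, now - entry.loaded_at


calls = []


def load_view(name: str) -> dict:
    """Stand-in for a single query against the denormalised read store."""
    calls.append(name)
    return {"view": name, "rows": []}


fake_now = [0.0]
service = CachedQueryService(load_view, {"dashboard": 60.0, "prices": 1.0},
                             clock=lambda: fake_now[0])
service.get("dashboard")
service.get("prices")

fake_now[0] = 5.0  # five seconds later
dashboard, dashboard_age = service.get("dashboard")  # within 60 s: served from cache
prices, prices_age = service.get("prices")           # older than 1 s: reloaded
```

Returning the age alongside the view model is one simple way to be explicit about staleness; a real screen might display it as "updated 5 seconds ago".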

How do we keep the query data store in sync?

This is where events come into the picture, and I think I’ll leave that for another post!

References