How I am handling out-of-order events
Using punctuation-based techniques to process out-of-order events
In the world of distributed systems, we often work with events and consumers without thinking about the reality of time when consuming them. We as developers have the tendency to think about events in a sequential manner, but when we are not working in a single-threaded environment, our thinking does not reflect reality.
I struggled quite a bit when I had to take design decisions around how to best tackle the order of events, and I want to share some of my findings.
You can find a link to the demo using the following link: https://github.com/ramihamati/demo_outoforder/tree/main/OutOfOrderDemo.Punctuation
Definition
An out-of-order event is one that is received later then you expected, (or it might have been processed earlier than expected). This may lead to a possible incorrect state in our system.
There are 2 common processing methods used in dealing with this:
- IOP — In Order Processing — usually using a buffer to store multiple events which we can later order before actually dispatching them to be processed.
- OOP — Out Of Order Processing — does not require order maintenance. In the case of ordering requirements, OOP buffers input items until a special condition is satisfied. This condition is supported by progress indicators such as punctuations, low watermarks, or heartbeats.
Several techniques are defined which fall into these 2 categories:
Buffer based approach
Punctuation based approach
Speculation based approach
Approximation based approach
You can read more about them using the links at the bottom of this page.
Context
- I wanted to consume and transform multiple events into aggregate events for special query purposes
- The transformed models are not critical so eventual consistency was good enough.
- I needed a simple solution, buffering was not an option because it introduces an extra level of complexity.
- My events were basic domain actions like created, updated, deleted. This simplified my design
I decided to go along with a punctuation-based approach, using an entity versioning system.
Design Decisions
- The original domain model will have a version property transparently handled by the repository.
- The sent event will contain the entire domain model (as an event model) independent of the action type. This meant that if I receive an Updated event before a Created event, I can use that information to create my transformed model
- Each transformed model will store the versions of all events used to create it.
Implementation — Structure
The project structure I will be presenting has the following structure:
- The api project contains the original domain models
- The listener contains the transformed models
- The common project contains the infrastructure classes helping me to keep the versioning system and also the events and event models.
Implementation — Api
Since the domain model must contain a version, I created an interface enforcing that:
The implementation of the above interface will represent the base class of our mongo entities (I will not go into detail aboutBaseDocument
, it’s a just a base mongo class and you can check it out in the git project)
What is important about my implementation is that I am not letting the user have to handle the version, because he might forget to increase it. Instead, I am doing this in the repository. You can check out the entire code in git, but here is a snippet:
For the API I created a simple controller, which will create and update entities and for each action, it will publish an event. Each event will have an event model and the event model for a specific entity is the same for all actions related to that entity. This is useful because if I am receiving an update event after a created event, I have access to the entire set of information needed for my transformed model.
For the publisher, I am using MassTransit configured for RabbitMQ
Implementation Listener
This part is a bit more complex. Here we have a set of entities that are composed of one or multiple entities. Perhaps it’s easier to start with the result and then exemplify the solution. My goal here is to create one BookExpanded
, a model which contains both book and author details.
So what what are the scenarios that we might have here:
- receive the
BookUpdated
event before theBookCreated
event - receive the
AuthorCreated
event after theBookCreated
/Updated
event - receive the
AuthorUpdated
event after theAuthorCreated
orBookCreated
/Updated
event
As you can see, all the above possibilities can induce an incorrect state in our system. Having the AuthorCreated
event received after the BookCreated
implies that we must store that information somewhere and because of this, we will need a separate copy of the BookAuthor
as an entity
The multi-version document is the base class storing the multiple versions of entities used to create the transformed model.
The versioning system consists of a prefix and the id of the entity, in my case the multi-version document has a base prefix (which you can remove) because for each transformed document there is a main event and the transformed document will store the prefix for that.
The multi-version document has 2 main methods, CanSetVersion
and SetVersion
, which basically checks if there is a lower version and if yes, it set’s the new version. The other 2 derived methods use the predefined prefix CanSetOwnVersion
and SetOwnVersion
The MultiVersionField
is a base class for a versionable property from our transformed class (see the Author
property in the BookExpanded
model). We want to be able to update this property, only if there is a newer version present, but make it seamless
Now I can show the entire BookExpanded model. You can see that to build this model, I am consuming book events, but not author events. Because the author will be stored in a separate copy so we are consuming the built BookAuthor model.
Implementation — Listener Consumers
It’s been a long read, so for those who manage to reach this far, thank you all!
In this section, I will show you the final consumers and how they leverage the final built models
I will show here only one sample, and you can check out the rest in the GitHub link. Below you have the ConsumerAuthorCreated and the pattern is similar for all consumers, we start by creating the author and checking if there were any books created before receiving this event.
Readings
Level Up Coding
Thanks for being a part of our community! Before you go:
- 👏 Clap for the story and follow the author 👉
- 📰 View more content in the Level Up Coding publication
- 🔔 Follow us: Twitter | LinkedIn | Newsletter
🚀👉 Join the Level Up talent collective and find an amazing job