In the fourth part of my event sourcing series, we’ll take a look at why long event streams (streams with lots of events) are a problem and what options there are to deal with it. We’ll also see why read models are not always enough to solve the performance issues caused by long event streams. And of course, I’ll discuss the trade-offs involved.
The problem of long event streams
When we project an event stream with a couple of events (or even a few hundred), this is very quick. However, when event streams grow longer and reach several thousand events, the projection begins to take noticeable time and resources: CPU load and data transfer from storage.
The number of events that are acceptable to project depends on many things:
- The request time that is still acceptable
- The compute power to make the projection
- The database query speed and I/O throughput
- The projection complexity: how fast each individual event application is (updating the current value with the next event)
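To make "projecting" and "event application" concrete, here is a minimal sketch of a projection as a fold over the events. The event name ItemAdded is borrowed from the compaction example later in this post; ItemRemoved and the function names are assumptions I made up for illustration.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class ItemAdded:
    quantity: int


@dataclass(frozen=True)
class ItemRemoved:  # hypothetical event, only here for illustration
    quantity: int


def apply(stock: int, event) -> int:
    """One event application: update the current value with the next event."""
    if isinstance(event, ItemAdded):
        return stock + event.quantity
    if isinstance(event, ItemRemoved):
        return stock - event.quantity
    return stock  # ignore events this projection does not care about


def project(events, initial: int = 0) -> int:
    """Fold over the whole stream; the work grows linearly with its length."""
    return reduce(apply, events, initial)


events = [ItemAdded(5), ItemRemoved(2), ItemAdded(10)]
print(project(events))  # prints 13
```

Every event has to be loaded from storage and passed through apply once, which is why both the stream length and the cost of a single event application matter.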
But read models solve this already, don’t they?
When you need fast queries and can live with slow commands, a read model is probably a good enough solution. The read model makes queries faster because no events need to be projected when querying data.
However, commands should always read and project the events, not query the read model. They should take their knowledge from the single source of truth – the events. This prevents problems with concurrency (e.g., the read model not yet updated) or when a bug results in incorrect read model data. Yeah, that shouldn’t happen, but believe me, it does.
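A small sketch of what this can look like. The in-memory store, the expected_version check, and all names are assumptions for illustration, not the API of any particular library. The command handler decides based on the projected events, while the read model lags behind.

```python
class ConcurrencyError(Exception):
    """Raised when the stream changed between reading and appending."""


class InMemoryEventStore:
    """Hypothetical stand-in for a real event store."""

    def __init__(self):
        self._streams = {}

    def read(self, stream_id):
        return list(self._streams.get(stream_id, []))

    def append(self, stream_id, event, expected_version):
        stream = self._streams.setdefault(stream_id, [])
        if len(stream) != expected_version:
            raise ConcurrencyError(stream_id)
        stream.append(event)


def project_stock(events):
    stock = 0
    for event in events:
        if event["type"] == "ItemAdded":
            stock += event["quantity"]
        elif event["type"] == "ItemRemoved":
            stock -= event["quantity"]
    return stock


def remove_items(store, stream_id, quantity):
    """Command handler: takes its knowledge from the events, not the read model."""
    events = store.read(stream_id)
    if quantity > project_stock(events):
        raise ValueError("not enough stock")
    store.append(
        stream_id,
        {"type": "ItemRemoved", "quantity": quantity},
        expected_version=len(events),
    )


store = InMemoryEventStore()
store.append("item-1", {"type": "ItemAdded", "quantity": 5}, expected_version=0)
read_model = {"item-1": 5}  # eventually consistent copy, not yet updated below

remove_items(store, "item-1", 4)  # succeeds; the read model still says 5
try:
    remove_items(store, "item-1", 4)  # only 1 left according to the events
    second_command = "accepted"
except ValueError:
    second_command = "rejected"
print(second_command)  # prints rejected
```

Had the second command trusted the stale read model, it would have removed four items that no longer exist.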
So the longer the event streams, the longer the commands take to execute. Updating the read model takes longer as well when you go with an eventually consistent approach that needs to reproject events. Still, you should act only when response times are actually a problem. Therefore, measure the response times in your system.
If you have to change a read model and need to reproject all events, long event streams can make the migration take considerably longer.
Finally, debugging a problem in a long event stream can quickly become overwhelming.
How to prevent long event streams?
There are several options to prevent long event streams:
- Snapshots (inside or outside of the event stream)
- Stream compaction
- Split the thing you project
- Absolute Events (special case for special situations)
Snapshots
Instead of projecting the entire event stream, we introduce snapshots periodically that capture the projection state at that point in the stream. When we query the events, we get the latest snapshot and all later events, and project only these.

A snapshot can either be a special event that is also part of the event stream, or it can be stored in its own place. Choose whatever is simpler with your storage choice.
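Here is a minimal sketch of the variant where the snapshot is stored outside the event stream. To keep it short, events are plain integers (stock changes) and the stream is a Python list; the Snapshot class, the SNAPSHOT_EVERY interval, and the function names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    version: int  # number of events already folded into `state`
    state: int    # projection state at that point in the stream


SNAPSHOT_EVERY = 100  # assumed interval; tune it to your response times


def load(events, snapshot: Optional[Snapshot] = None):
    """Start from the latest snapshot and project only the later events."""
    state = snapshot.state if snapshot else 0
    version = snapshot.version if snapshot else 0
    tail = events[version:]  # a real store would query only these events
    for delta in tail:
        state += delta
    return state, version + len(tail), len(tail)


def maybe_snapshot(state, version, snapshot: Optional[Snapshot]):
    """Write a new snapshot once enough events have piled up since the last one."""
    last = snapshot.version if snapshot else 0
    if version - last >= SNAPSHOT_EVERY:
        return Snapshot(version=version, state=state)
    return snapshot


events = [1] * 250
snapshot = Snapshot(version=200, state=200)
state, version, projected = load(events, snapshot)
print(state, version, projected)  # prints 250 250 50
```

Only 50 of the 250 events are projected here. With a real database, the slice would become a query for all events after the snapshot version.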
Snapshots provide very good performance because you can cut the stream length to whatever length works for you by inserting snapshots more or less frequently. They are conceptually also rather simple, and libraries typically support them well (I’m not an Event Sourcing library expert, so don’t trust me blindly on this!).
The disadvantage of snapshots is that as the domain model evolves, they may need to be adapted, or you may need to support multiple versions of snapshot events. We’ll come back to versioning later in this series, but for now, I’ll just note that maintaining a larger number of versions can become cumbersome and error-prone.
Stream compaction
When compacting an event stream to reduce the number of events, we replace many old events with fewer, more meaningful ones. For example, 10 000 ItemAdded events could be replaced with a single InventorySet event.

Whether this approach is applicable depends on the kind of events you have.
Furthermore, we change the event stream completely – no more “append only of immutable events”. If this is important to you, then this approach is not for you.
As a side effect, we replace a fine-grained history with a coarser one. Again, consider whether this is important for you.
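The compaction idea can be sketched as follows, using the ItemAdded and InventorySet events from the example above. The dictionary layout of the events and the keep_last parameter are assumptions for illustration.

```python
def project(events):
    """Current stock: InventorySet sets the value, ItemAdded increases it."""
    stock = 0
    for event in events:
        if event["type"] == "InventorySet":
            stock = event["quantity"]
        elif event["type"] == "ItemAdded":
            stock += event["quantity"]
    return stock


def compact(events, keep_last):
    """Replace all but the last `keep_last` events with one InventorySet event."""
    cut = max(len(events) - keep_last, 0)
    old, recent = events[:cut], events[cut:]
    if not old:
        return list(events)  # nothing to compact
    return [{"type": "InventorySet", "quantity": project(old)}] + recent


events = [{"type": "ItemAdded", "quantity": 1} for _ in range(10_000)]
compacted = compact(events, keep_last=10)
print(len(compacted), project(compacted))  # prints 11 10000
```

The projection result stays the same, but the history of the first 9 990 additions is gone, and the stored stream has been rewritten.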
Split the thing you project
This is the best option if you can split the thing* into meaningful parts. Obviously, this only makes sense when you don’t have to read all the split-out events every time you need this thing.
So, I suggest that you review your use cases, queries, and views to understand which data they use and whether you can identify a subset that maps well to a dedicated event stream. Another idea is to split events into buckets, for example, events per year, semester, or season. Maybe registration events can be bucketised by the year in which they happen, so that we query only one year at a time. Get creative with this approach, but always think from the domain via your model to the event stream. The domain should always be the driver.**

Of course, splitting an already existing thing with many already existing events needs a migration.
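The bucket idea for registration events could look roughly like this. The stream naming scheme, the StudentRegistered event, and the in-memory dictionary standing in for the event store are all assumptions for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical in-memory store: one event stream per stream id.
streams = defaultdict(list)


def stream_id(occurred_on: date) -> str:
    """One stream per year instead of one ever-growing registration stream."""
    return f"registrations-{occurred_on.year}"


def append(event):
    streams[stream_id(event["occurred_on"])].append(event)


def registrations_in(year: int) -> int:
    """Projects only the bucket of the requested year."""
    return len(streams.get(f"registrations-{year}", []))


append({"type": "StudentRegistered", "occurred_on": date(2023, 9, 1)})
append({"type": "StudentRegistered", "occurred_on": date(2024, 2, 1)})
append({"type": "StudentRegistered", "occurred_on": date(2024, 9, 1)})
print(registrations_in(2024))  # prints 2
```

This only pays off if the queries really stay within one bucket. A query across all years would have to read every bucket again.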
* No, I won’t call it an Aggregate. IMHO a very bad choice for a name coming from Domain Driven Design. An Aggregate in DDD should be called something like a Consistency-Boundary-Providing-Thing 😊. And more so, DDD has nothing to do with Event Sourcing, so why use a name from DDD anyway?
** This doesn’t mean that your models and event streams have to match the real-world domain one-to-one perfectly. The models and event streams should primarily make your business logic code simple and easy to maintain. Being conceptually close to the real-world domain helps communication with domain experts, customers, and users. That is nice, but IMHO source code needs a mapping anyway, and should support performant, easy to maintain code foremost (Domain Driven Design and Object Oriented Programming are full of good ideas, but don’t overdo it 😅).
Absolute Events – a niche solution
Absolute events – a name I invented for this post – are special events that set the state of the projection absolutely. So, all that is needed is the last event in an event stream of absolute events.
Snapshot events are a special case of absolute events because you don’t need events before them.
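As a tiny sketch, assuming a stream that consists only of InventorySet events (borrowed from the compaction example) in a dictionary layout I made up: folding over the whole stream and looking only at the last event give the same result.

```python
def project_fold(events):
    """Regular projection: applies every event in order."""
    stock = 0
    for event in events:
        stock = event["quantity"]  # an absolute event overwrites the state
    return stock


def project_absolute(events):
    """Shortcut for absolute events: only the last event matters."""
    return events[-1]["quantity"] if events else 0


events = [
    {"type": "InventorySet", "quantity": 7},
    {"type": "InventorySet", "quantity": 3},
    {"type": "InventorySet", "quantity": 12},
]
print(project_fold(events), project_absolute(events))  # prints 12 12
```

With a real store, this would mean reading only the newest event of the stream instead of loading all of them.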
We are lucky that the longest event streams in our systems (they span several years with multiple events per day) are streams of absolute events. In our case, these are even bi-temporal event streams, which largely preclude the use of read models and snapshots. As you might have guessed, more on bi-temporal event sourcing follows in a later post (or search for it on this blog 😉, but be warned).
Summary
Long event streams can become a performance issue. If so, there are several options, such as snapshots or splitting, that should help restore a snappy application.
If you hoped for a deep dive today, I must disappoint you. I have to prepare some lectures at the university this week as well.