In the last post, I introduced the MultiProducerConcurrentConsumer component. In this post, I’m going to walk through the constructor of the component and explain the preallocation trade-off that was made.
With almost every code that is executed on the hot path of your library, framework or system you need to make tradeoffs regarding allocations. In the end, it boils down to roughly two choices:
- Allocate as much as you need upfront
- Allocate on-demand when you need it
The benefit of only allocating when needed is that the memory consumption only grows when it is needed. The downside of this approach is that under highly concurrent scenarios allocating structures in a safe and lock-free way can be tricky.
The drawback of allocating upfront is that even though your application might never need it, the memory is preallocated and will stay that way. On the other hand, if we allocate the potentially needed structures upfront we never need to allocate during concurrent execution. Unfortunately preallocating upfront only works when you know the boundary conditions of the problem at hand.
With the MPCC structure, we can make a few assumptions which allow us to pre-allocate internally needed structures.
public MultiProducerConcurrentCompletion( int batchSize, TimeSpan pushInterval, int maxConcurrency, int numberOfSlots) { this.maxConcurrency = maxConcurrency; this.pushInterval = pushInterval; this.batchSize = batchSize; this.numberOfSlots = numberOfSlots; queues = new ConcurrentQueue<TItem>[numberOfSlots]; for (var i = 0; i < numberOfSlots; i++) { queues[i] = new ConcurrentQueue<TItem>(); } var maxNumberOfConcurrentOperationsPossible = numberOfSlots*maxConcurrency; pushTasks = new List<Task>(maxNumberOfConcurrentOperationsPossible); itemListBuffer = new ConcurrentQueue<List<TItem>>(); for (var i = 0; i < maxNumberOfConcurrentOperationsPossible; i++) { itemListBuffer.Enqueue(new List<TItem>(batchSize)); } }
Since we know the number of slots up front and we defined the component needs FIFO semantics we can use an array of ConcurrentQueue<TItem> (Line 9) and create all the queues for all the slots ahead of time (Line 10-13).
The maximum concurrency is defined per slot. So we can calculate the maximum number of concurrent operations that are possible at any given time to be number of slots multiplied with maximum concurrency (Line 15). As soon as we calculated maxNumberOfConcurrentOperationsPossible we can allocate a List<Task> which is going to hold concurrently happening operations for graceful shutdown purposes (Line 16).
Last but not least we make the following tradeoff. During runtime, the component will get many items pushed to it and complete items in batches. If we would allocate a List<TItem> everytime the pump callback is called, we’d be creating a lot of Gen0 Garbage. Namely maxNumberOfConcurrentOperationsPossible * List<TItem> at maximum per push interval. Depending on the push interval settings this could drastically impact the performance. That’s why the MPCC creates a ConcurrentQueue<List<TItem>> and enqueues List<TItem>(batchSize) into it(Line 18-22). This queue will serve as a List<TItem> buffer. Effectively we are reusing the lists during the whole lifetime of the component. You might have noticed that we initialize the list with the batch size that is also known upfront to prevent unnecessary growing of the lists.
Hope you like this so far. Stay tuned for the next post about MPCC.
[…] by /u/danielmarbach [link] […]
[…] Previous Home / .NET / MultiProducerConcurrentConsumer – Start it […]