Azure Service Bus .NET SDK Deep Dive – Deduplication

Explains how de-duplication can help to make sure a message is only delivered once within a certain time period. For more posts in this series, go to Contents.

In the posts atomic sends and send via, I showed ways to achieve atomicity for a group of messages or messages that are sent out as part of an outgoing message. Let’s have a look at a piece of code that looks super simple but might not always behave the way you’d wish:

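public async Task<IActionResult> SubmitOrder(Order order) {
   // The message is sent first...
   await messageSender.SendAsync(new Message());
   // ...so if saving fails here, the message has already left.
   await context.SaveChangesAsync();
   return Ok();
}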

In the above pseudo controller code, we send a message to Azure Service Bus and then store some state in the database context. If saving to the database fails, the message has already been sent to the destination, and the client might retry submitting the same order (for example, because of a retry policy), which sends the same message a second time. Unfortunately, it is not possible to enlist the message send in the database context without going down the path of implementing an outbox pattern. The SDK prevents mixing multiple resource managers in the same ambient transaction and will throw a TransactionPromotionException.

There is another way though. If it is possible to derive a unique business identifier from the operation that remains stable across retries, such as the order ID, that identifier can be promoted to the ID of the message.

public async Task<IActionResult> SubmitOrder(Order order) {
   await messageSender.SendAsync(new Message() {
      // Stable across retries, so a resubmitted order reuses the same ID.
      MessageId = order.OrderId
   });
   await context.SaveChangesAsync();
   return Ok();
}
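
A single identifier such as the order ID is not always available. As a minimal sketch, assuming the operation carries several string values that together stay stable across retries, a deterministic ID could be derived by hashing them. `MessageIds.Derive` is a hypothetical helper for illustration and not part of the SDK.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class MessageIds
{
    // Hypothetical helper (not part of the Service Bus SDK): derives a
    // deterministic message ID from values that stay stable across retries.
    public static string Derive(params string[] stableParts)
    {
        // Length-prefix each part so ("ab", "c") and ("a", "bc") produce different input.
        var builder = new StringBuilder();
        foreach (var part in stableParts)
        {
            builder.Append(part.Length).Append(':').Append(part);
        }

        using (var sha256 = SHA256.Create())
        {
            var hash = sha256.ComputeHash(Encoding.UTF8.GetBytes(builder.ToString()));
            // 64 hex characters, which stays below the 128 character limit of MessageId.
            return BitConverter.ToString(hash).Replace("-", string.Empty);
        }
    }
}
```

With such a helper, the send could set `MessageId = MessageIds.Derive(order.CustomerId, order.CartId)`, where `CustomerId` and `CartId` are assumed string properties used only for illustration.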

By putting the application in charge of setting the message ID and enabling duplicate detection on the queue or topic the messages are delivered to, multiple sends of the same message with the same identifier are automatically de-duplicated by Azure Service Bus.

var client = new ManagementClient(connectionString);
var queueDescription = new QueueDescription(destination)
{
    RequiresDuplicateDetection = true,
    DuplicateDetectionHistoryTimeWindow = TimeSpan.FromSeconds(20)
};
await client.CreateQueueAsync(queueDescription);
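
Since duplicate detection applies to topics as well, the equivalent setup for a topic could look roughly like the following sketch. It assumes the same `Microsoft.Azure.ServiceBus.Management` package as the queue example; `TopicSetup`, `connectionString` and `topicName` are placeholder names chosen for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus.Management;

public static class TopicSetup
{
    // connectionString and topicName are placeholders supplied by the caller.
    public static async Task CreateDeduplicatedTopicAsync(string connectionString, string topicName)
    {
        var client = new ManagementClient(connectionString);
        try
        {
            var topicDescription = new TopicDescription(topicName)
            {
                RequiresDuplicateDetection = true,
                DuplicateDetectionHistoryTimeWindow = TimeSpan.FromSeconds(20)
            };
            await client.CreateTopicAsync(topicDescription);
        }
        finally
        {
            // Release the underlying connection once the topic has been created.
            await client.CloseAsync();
        }
    }
}
```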

The duplicate detection history time window defaults to 30 seconds for queues and topics, with a maximum value of seven days.

The following snippet sends three messages with the same message ID in a single call:

var client = new QueueClient(connectionString, destination);
var content = Encoding.UTF8.GetBytes("Message1Message1");
var messageId = new Guid(content).ToString();

var messages = new List<Message>
{
    new Message(content) { MessageId = messageId },
    new Message(content) { MessageId = messageId },
    new Message(content) { MessageId = messageId }
};

await client.SendAsync(messages);

Let’s see it in action. Although three messages with the same ID are sent, only one of them should end up in the queue.

Enabling duplicate detection and the size of the window directly impact the queue (and topic) throughput, since all recorded message-ids must be matched against the newly submitted message identifier. Keeping the window small means that fewer message-ids must be retained and matched, and throughput is impacted less. For high throughput entities that require duplicate detection, you should keep the window as small as possible.
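
To make the cost of the window more tangible, here is a small in-memory model of a duplicate detection history. It is purely illustrative and not how Azure Service Bus implements the feature; `DuplicateDetectionHistory` and `TryAccept` are made-up names. It shows that every accepted ID has to be retained until it falls out of the window, so a larger window means more IDs to keep and match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative in-memory model only; not the actual Service Bus implementation.
public sealed class DuplicateDetectionHistory
{
    private readonly TimeSpan window;
    private readonly Dictionary<string, DateTimeOffset> seen =
        new Dictionary<string, DateTimeOffset>();

    public DuplicateDetectionHistory(TimeSpan window)
    {
        this.window = window;
    }

    // Number of message IDs currently retained.
    public int Count => seen.Count;

    // Returns true if the message is accepted, false if it is dropped as a duplicate.
    public bool TryAccept(string messageId, DateTimeOffset now)
    {
        // Forget IDs that have fallen out of the window.
        var expired = seen.Where(e => now - e.Value >= window)
                          .Select(e => e.Key)
                          .ToList();
        foreach (var id in expired)
        {
            seen.Remove(id);
        }

        if (seen.ContainsKey(messageId))
        {
            return false;
        }

        seen[messageId] = now;
        return true;
    }
}
```

In this model, with a window of 20 seconds, an ID accepted at second 0 is rejected at second 5 and accepted again at second 25, because by then it has left the history.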

About the author

Daniel Marbach
