SQS: Introduction to FIFO Queues


Amazon Simple Queue Service (SQS) offers a simple interface for message queueing: you store messages in a queue so that your own logic can process them later. It is commonly used with microservices and distributed systems (systems spread across multiple nodes/computers).

What’s a first-in-first-out queue?
A first-in-first-out (FIFO) queue is much like a queue at a shop: the first message that makes it to the queue is the first message delivered to a consumer, as shown in the example below.


The most important attribute that we’ll focus on in this article is the required MessageGroupId attribute, which is the backbone of how FIFO queues handle ordering in AWS.

It tells the queue which ‘partition’ (message group) you’d like the message to be enqueued in, as shown below:


The order of messages is maintained within each message group (partition), not across message groups. If you have multiple users carrying out actions, ideally you want the message group to be something along the lines of user_<user_id>, so that actions from a particular user are grouped and processed in the order they happen.
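As a minimal sketch of the per-user grouping described above: the helper below assumes `sqs` is a boto3 SQS client (for example `boto3.client("sqs")`). The function name, the payload shape and the queue URL are hypothetical and only there for illustration.

```python
import json
import uuid


def send_user_action(sqs, queue_url, user_id, action):
    """Enqueue one user action on a FIFO queue, grouped per user.

    `sqs` is assumed to be a boto3 SQS client; payload shape is hypothetical.
    """
    return sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps({"user_id": user_id, "action": action}),
        # All actions of one user share a message group, so they stay ordered.
        MessageGroupId=f"user_{user_id}",
        # Required unless content-based deduplication is enabled on the queue.
        MessageDeduplicationId=str(uuid.uuid4()),
    )
```

A random deduplication ID means a retried send is treated as a new message. If you need retries to be deduplicated, derive the ID from the event itself instead.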

How can I have multiple consumers reading from the same queue?
In the example below, we’ll go over how AWS handles maintaining the order of messages whilst having multiple consumers reading from the same queue.


Taking a look at the diagram above, we’re seeing:

  • Groups of customers: equivalent to messages grouped by a MessageGroupId (Group 1, Group 2, Group 3)
  • Shop: equivalent to a FIFO queue
  • Multiple employees: equivalent to multiple consumers reading from the same queue (commonly referred to as competing consumers)

Scenario 1
We only have messages in Group 1
The first message from Group 1 is picked up by one of the consumers and that message group is locked: no other message from the group is handed to any consumer until that first message is acknowledged.

Scenario 2
We have messages in all message groups
Consumers will pick up messages from any message group, and the order within each group is maintained through the locking mechanism described in Scenario 1. This is where you need to pay close attention to how messages are grouped, so that work from different groups can interleave across consumers.

Scenario 3
We have an issue with processing a message from Group 1
The unacknowledged message blocks the entire message group until it is dealt with: either the visibility timeout expires and the message is delivered again, or the maximum number of receives is reached and the message is moved to the dead-letter queue.
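The three scenarios can be sketched with a toy in-memory model. This is a simplification for illustration only, not how SQS is implemented: it hands out one message at a time, and the class and method names are hypothetical.

```python
from collections import deque


class FifoQueueModel:
    """Toy model of per-group locking in a FIFO queue."""

    def __init__(self):
        self._messages = deque()  # (group_id, body) in arrival order
        self._in_flight = {}      # group_id -> body awaiting acknowledgement

    def send(self, group_id, body):
        self._messages.append((group_id, body))

    def receive(self):
        """Return the oldest message whose group is not locked, or None."""
        for index, (group_id, body) in enumerate(self._messages):
            if group_id not in self._in_flight:
                del self._messages[index]
                self._in_flight[group_id] = body  # lock the group
                return group_id, body
        return None

    def acknowledge(self, group_id):
        """Delete the in-flight message, which unlocks its group."""
        self._in_flight.pop(group_id, None)

    def expire_visibility(self, group_id):
        """Simulate a visibility timeout: the message can be received again."""
        body = self._in_flight.pop(group_id)
        # It is older than anything else in its group, so it goes to the front.
        self._messages.appendleft((group_id, body))


queue = FifoQueueModel()
queue.send("group-1", "a1")
queue.send("group-1", "a2")
queue.send("group-2", "b1")

print(queue.receive())  # ('group-1', 'a1')
print(queue.receive())  # ('group-2', 'b1') because group-1 is locked
print(queue.receive())  # None: both groups are locked
queue.acknowledge("group-1")
print(queue.receive())  # ('group-1', 'a2')
```

The second `receive()` skips `a2` even though it arrived before `b1`, which is the interleaving that multiple message groups make possible.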

How can I promote interleaving?
It all depends on the data model, but let’s take a simple example of a data structure:


Assuming that we care about the order of events on the vehicles, we can take two routes:

Grouping by dealer group
This would mean that a dealer group can only have one consumer at a time — since a message group locks to maintain order as explained in Scenario 1.

This would result in a backlog of events and poor performance.

Grouping by dealer
This would mean that every dealer can have its own consumer, which would lead to better performance.

One can try to go lower in the data structure to gain better performance. In a nutshell, the finer-grained your message groups are (and so the less contention there is on any single group), the more likely you are to get good processing throughput, as long as the grouping still preserves the ordering you care about.
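To make the trade-off concrete, here is a small sketch that counts how many message groups each choice produces. The event fields (`dealer_group`, `dealer`, `vehicle`) and their values are hypothetical, assuming a hierarchy of dealer group, then dealer, then vehicle.

```python
def group_counts(events, key):
    """Count events per message group when grouping by the given field."""
    counts = {}
    for event in events:
        group_id = f"{key}_{event[key]}"
        counts[group_id] = counts.get(group_id, 0) + 1
    return counts


# Hypothetical events: two dealers that belong to the same dealer group.
events = [
    {"dealer_group": "dg1", "dealer": "d1", "vehicle": "v1"},
    {"dealer_group": "dg1", "dealer": "d1", "vehicle": "v2"},
    {"dealer_group": "dg1", "dealer": "d2", "vehicle": "v3"},
    {"dealer_group": "dg1", "dealer": "d2", "vehicle": "v4"},
]

by_dealer_group = group_counts(events, "dealer_group")
by_dealer = group_counts(events, "dealer")

# A message group is worked on by one consumer at a time, so the number of
# groups is an upper bound on how many consumers can be busy at once.
print(len(by_dealer_group))  # 1
print(len(by_dealer))        # 2
```

With these four events, grouping by dealer group leaves a single group holding the whole backlog, while grouping by dealer lets two consumers work in parallel.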


Configuration Overview


What’s visibility timeout used for?
The amount of time a received message stays hidden from consumers. If the message is still unacknowledged when the timeout expires, SQS makes it available for delivery again.

I would recommend that you profile (measure) how long your logic takes to process a single message and add reasonable padding, so that SQS is unlikely to hand out the same message whilst you’re still processing it.

A more robust solution would be to have a ‘heartbeat’ — where you extend the visibility timeout of a message whilst processing. (Examples: Python / JavaScript)
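A heartbeat along those lines could look like the sketch below. It assumes `sqs` is a boto3 SQS client and uses its `change_message_visibility` call. The helper name and the default numbers are hypothetical, and `interval` needs to be comfortably shorter than both the queue's visibility timeout and `visibility_timeout`.

```python
import threading


def process_with_heartbeat(sqs, queue_url, receipt_handle, work,
                           visibility_timeout=30, interval=10):
    """Run work() while periodically extending the message's visibility.

    `sqs` is assumed to be a boto3 SQS client.
    """
    done = threading.Event()

    def heartbeat():
        # Event.wait returns False on timeout and True once `done` is set.
        while not done.wait(interval):
            sqs.change_message_visibility(
                QueueUrl=queue_url,
                ReceiptHandle=receipt_handle,
                # The new timeout counts from the moment of this call.
                VisibilityTimeout=visibility_timeout,
            )

    thread = threading.Thread(target=heartbeat, daemon=True)
    thread.start()
    try:
        return work()
    finally:
        # Stop the heartbeat whether work() succeeded or raised.
        done.set()
        thread.join()
```

If `work()` raises, the heartbeat stops and the message becomes visible again once the last extension runs out, so the usual retry path still applies.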

What’s the delivery delay setting used for?
The amount of time you want SQS to wait before making a new message available.

A delay of five seconds would mean that once you add a message to a queue, that particular message cannot be retrieved by any of your consumers until that delay has expired.

What’s the receive message wait time used for?

  • Receive message wait time is set to 0
    A request is sent to the servers, a query is executed, and a response is returned to the client straight away (with or without results). This is referred to as short polling.
  • Receive message wait time is set to a value greater than 0
    A request is sent to the servers and held open until a message becomes available or the specified time (up to 20 seconds) expires, at which point the results (if any) are returned. This is referred to as long polling.
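The wait time can also be set per request. A minimal sketch, assuming `sqs` is a boto3 SQS client (the helper name and defaults are hypothetical):

```python
def receive_with_long_polling(sqs, queue_url, wait_seconds=20, max_messages=10):
    """Fetch up to max_messages, waiting up to wait_seconds for one to arrive.

    `sqs` is assumed to be a boto3 SQS client.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS allows 1 to 10
        WaitTimeSeconds=wait_seconds,      # 0 means short polling, 20 is the maximum
    )
    # The "Messages" key is absent when nothing was received.
    return response.get("Messages", [])
```

A `WaitTimeSeconds` value passed on the request takes precedence over the queue's receive message wait time for that call.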

What’s the message retention period used for?
The amount of time you want SQS to retain messages for — any messages older than the time specified will be deleted.
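The four settings above map to queue attributes. The sketch below builds the attribute dictionary and checks each value against the limits SQS accepts. The helper name is hypothetical, and the defaults mirror the SQS defaults except for the wait time, which is set to 20 here to favour long polling.

```python
def fifo_queue_attributes(visibility_timeout=30, delay_seconds=0,
                          wait_time_seconds=20, retention_seconds=345600):
    """Build the Attributes dict for creating a FIFO queue (all in seconds)."""
    limits = {
        "VisibilityTimeout": (visibility_timeout, 0, 43200),         # up to 12 hours
        "DelaySeconds": (delay_seconds, 0, 900),                     # up to 15 minutes
        "ReceiveMessageWaitTimeSeconds": (wait_time_seconds, 0, 20),
        "MessageRetentionPeriod": (retention_seconds, 60, 1209600),  # 1 minute to 14 days
    }
    attributes = {"FifoQueue": "true"}
    for name, (value, low, high) in limits.items():
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
        attributes[name] = str(value)  # SQS expects attribute values as strings
    return attributes


print(fifo_queue_attributes(delay_seconds=5)["DelaySeconds"])  # 5
```

With a boto3 client this could be passed as `sqs.create_queue(QueueName="orders.fifo", Attributes=fifo_queue_attributes())`, where `orders.fifo` is a made-up name. FIFO queue names have to end in `.fifo`.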

What’s a dead-letter queue?


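A dead-letter queue is a queue used to store messages that could not be processed successfully: once a message has been received more than the configured maximum number of times without being acknowledged, SQS moves it there.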
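A dead-letter queue is attached to a source queue through the `RedrivePolicy` attribute. A minimal sketch, assuming `sqs` is a boto3 SQS client (the helper names and the default of 5 receives are hypothetical):

```python
import json


def redrive_policy(dead_letter_queue_arn, max_receive_count=5):
    """Return the RedrivePolicy attribute value as a JSON string."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    return json.dumps({
        "deadLetterTargetArn": dead_letter_queue_arn,
        # Receives allowed before the message is moved to the dead-letter queue.
        "maxReceiveCount": str(max_receive_count),
    })


def attach_dead_letter_queue(sqs, queue_url, dead_letter_queue_arn,
                             max_receive_count=5):
    """Point the source queue at its dead-letter queue."""
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes={
            "RedrivePolicy": redrive_policy(dead_letter_queue_arn, max_receive_count)
        },
    )
```

The dead-letter queue of a FIFO queue must itself be a FIFO queue.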

How are messages acknowledged?
A received message is not automatically acknowledged in SQS. You have to explicitly delete the message for it to be acknowledged; otherwise it is delivered again once the visibility timeout expires. If you move it to another queue yourself (such as a dead-letter queue), you still need to delete it from the original queue.
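In practice, acknowledging means calling delete with the receipt handle that came with the message. A minimal sketch of one polling round, assuming `sqs` is a boto3 SQS client (`consume_once` and `handler` are hypothetical names):

```python
def consume_once(sqs, queue_url, handler):
    """Receive a batch, run handler on each message, delete the successes.

    `sqs` is assumed to be a boto3 SQS client. Returns the number of
    messages acknowledged (deleted).
    """
    response = sqs.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    acknowledged = 0
    for message in response.get("Messages", []):
        try:
            handler(message["Body"])
        except Exception:
            # Not deleted, so SQS redelivers it after the visibility timeout.
            # Stop here so later messages are not processed out of order.
            break
        sqs.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
        acknowledged += 1
    return acknowledged
```

Stopping at the first failure is a conservative choice: a batch can contain several messages from the same message group, and carrying on would risk handling them out of order.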