AWS SQS and SNS: Messaging and Event-Driven Architecture (2026)
Amazon SQS (Simple Queue Service) and Amazon SNS (Simple Notification Service) are the backbone of event-driven architectures on AWS. SQS decouples producers from consumers via durable queues; SNS delivers messages to multiple subscribers simultaneously. Used together, they enable fanout patterns, reliable async processing, and loosely coupled microservices. This guide covers both services in depth — including the gotchas that trip up developers in production.
SQS Standard vs FIFO
Choosing the wrong queue type is the most common SQS mistake. Standard queues are nearly unlimited in throughput but offer at-least-once delivery and best-effort ordering. FIFO queues guarantee exactly-once processing and strict order within a message group, but by default cap at 300 transactions/second (3,000 messages/second with batching) unless high-throughput mode is enabled.
| Feature | Standard Queue | FIFO Queue |
|---|---|---|
| Throughput | Nearly unlimited | 300 TPS (3,000 with batching) |
| Message ordering | Best-effort | Strict FIFO per message group |
| Delivery guarantee | At least once (duplicates possible) | Exactly once |
| Deduplication | Not built-in | Built-in (5-minute dedup window) |
| Price (per million) | $0.40 | $0.50 |
| Use case | High-throughput async tasks | Financial transactions, order processing |
# Create a FIFO queue
aws sqs create-queue \
--queue-name OrderProcessing.fifo \
--attributes '{
"FifoQueue": "true",
"ContentBasedDeduplication": "false",
"VisibilityTimeout": "60",
"MessageRetentionPeriod": "86400"
}'
# Send a message to FIFO queue (MessageGroupId required)
aws sqs send-message \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/OrderProcessing.fifo \
--message-body '{"orderId": "ord-123", "action": "process"}' \
--message-group-id "customer-cust-456" \
--message-deduplication-id "ord-123-process"
MessageGroupId determines ordering scope. Messages with the same group ID are processed in strict order. Use customer ID or tenant ID as the group ID so orders from one customer don't block others.
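The grouping advice above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the helper name `build_fifo_message` is hypothetical, and deriving the deduplication ID from a hash of the body is one possible choice (the CLI example above uses an explicit ID instead).

```python
import hashlib
import json


def build_fifo_message(queue_url, customer_id, order):
    """Return keyword arguments for SQS send_message on a FIFO queue.

    Hypothetical helper for illustration only.
    """
    # sort_keys makes the serialized body independent of dict key order
    body = json.dumps(order, sort_keys=True)
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        # One group per customer: strict order within a customer,
        # while different customers can be processed in parallel.
        "MessageGroupId": f"customer-{customer_id}",
        # Deterministic ID: resending the same body inside the
        # deduplication window is treated as a duplicate.
        "MessageDeduplicationId": hashlib.sha256(body.encode("utf-8")).hexdigest(),
    }
```

With a boto3 SQS client, the result would be passed as `sqs.send_message(**build_fifo_message(url, "cust-456", order))`.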
Visibility Timeout and Long Polling
When a consumer receives a message, SQS makes it invisible to other consumers for the visibility timeout period. If the consumer doesn't delete the message before the timeout expires, SQS makes it visible again — another consumer can pick it up. This is how SQS provides fault tolerance without message loss.
import boto3
import json

sqs = boto3.client('sqs', region_name='us-east-1')
QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/MyQueue'

def process_message(body):
    # Placeholder: replace with your own processing logic
    print(f"Processing: {body}")

# Long polling: wait up to 20 seconds for messages (reduces empty responses)
response = sqs.receive_message(
    QueueUrl=QUEUE_URL,
    MaxNumberOfMessages=10,        # batch up to 10
    WaitTimeSeconds=20,            # long poll
    VisibilityTimeout=60,          # give processor 60s to finish
    AttributeNames=['All'],
    MessageAttributeNames=['All']
)

for message in response.get('Messages', []):
    try:
        body = json.loads(message['Body'])
        # Extend visibility BEFORE starting work that may outlast the 60s window;
        # extending after processing has finished serves no purpose
        sqs.change_message_visibility(
            QueueUrl=QUEUE_URL,
            ReceiptHandle=message['ReceiptHandle'],
            VisibilityTimeout=120
        )
        process_message(body)
        # Delete only after successful processing
        sqs.delete_message(
            QueueUrl=QUEUE_URL,
            ReceiptHandle=message['ReceiptHandle']
        )
    except Exception as e:
        print(f"Failed to process: {e}")
        # Don't delete: the message returns to the queue after the visibility timeout
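Set the visibility timeout to at least 6x your average processing time, and comfortably above the worst case, to avoid duplicate processing from timeout races. For Lambda consumers, AWS recommends a queue visibility timeout of at least six times the function timeout.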
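When processing time varies widely, a fixed timeout is hard to pick. One option is a heartbeat that keeps resetting the visibility timeout while the handler runs. The sketch below assumes `client` behaves like a boto3 SQS client (only `change_message_visibility` and `delete_message` are used); the function name `process_with_heartbeat` and its parameters are hypothetical.

```python
import threading


def process_with_heartbeat(client, queue_url, message, handler,
                           extend_to=60, interval=20.0):
    """Run handler(message) while periodically resetting the visibility timeout.

    Hypothetical helper; `client` is assumed to be a boto3 SQS client.
    """
    stop = threading.Event()

    def heartbeat():
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop runs every `interval` seconds until processing ends.
        while not stop.wait(interval):
            client.change_message_visibility(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
                VisibilityTimeout=extend_to,
            )

    worker = threading.Thread(target=heartbeat, daemon=True)
    worker.start()
    try:
        handler(message)
    finally:
        stop.set()
        worker.join()
    # Reached only if the handler did not raise: safe to delete.
    client.delete_message(
        QueueUrl=queue_url,
        ReceiptHandle=message["ReceiptHandle"],
    )
```

Keep `interval` well below `extend_to` so the message never becomes visible between heartbeats. If the handler raises, the message is left in place and reappears once the last extension expires.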
Dead-Letter Queues
A Dead-Letter Queue (DLQ) is a separate SQS queue that receives messages that fail processing after a configurable number of attempts (maxReceiveCount). DLQs are essential for production. Without them, poison-pill messages keep cycling back into the queue until the retention period expires, wasting consumer capacity and, on FIFO queues, blocking the rest of their message group.
# Step 1: Create the DLQ
aws sqs create-queue --queue-name OrderProcessing-DLQ
# Get DLQ ARN
DLQ_ARN=$(aws sqs get-queue-attributes \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/OrderProcessing-DLQ \
--attribute-names QueueArn \
--query 'Attributes.QueueArn' --output text)
# Step 2: Set redrive policy on the main queue
aws sqs set-queue-attributes \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/OrderProcessing \
--attributes "{
\"RedrivePolicy\": \"{\\\"deadLetterTargetArn\\\":\\\"${DLQ_ARN}\\\",\\\"maxReceiveCount\\\":\\\"3\\\"}\"
}"
After 3 failed receive-and-not-delete cycles, messages move to the DLQ. Set up a CloudWatch alarm on ApproximateNumberOfMessagesVisible for the DLQ to alert on-call when poisoned messages arrive.
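The nested escaping in the shell command above is easy to get wrong. As a rough Python alternative, the sketch below builds the same redrive policy with `json.dumps` and prepares the arguments for the CloudWatch alarm described above. Both helper names are hypothetical, and the alarm name, period, and threshold are illustrative assumptions.

```python
import json


def build_redrive_attributes(dlq_arn, max_receive_count=3):
    """Queue attributes that move messages to dlq_arn after repeated failed receives."""
    policy = {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
    # RedrivePolicy is a JSON document passed as a string attribute value.
    return {"RedrivePolicy": json.dumps(policy)}


def build_dlq_alarm(dlq_name, alarm_topic_arn):
    """Keyword arguments for CloudWatch put_metric_alarm: fire when the DLQ is non-empty."""
    return {
        "AlarmName": f"{dlq_name}-not-empty",  # illustrative name
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateNumberOfMessagesVisible",
        "Dimensions": [{"Name": "QueueName", "Value": dlq_name}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [alarm_topic_arn],
    }
```

With boto3 clients, these would be used as `sqs.set_queue_attributes(QueueUrl=main_queue_url, Attributes=build_redrive_attributes(dlq_arn))` and `cloudwatch.put_metric_alarm(**build_dlq_alarm("OrderProcessing-DLQ", on_call_topic_arn))`, where `main_queue_url` and `on_call_topic_arn` are placeholders for your own values.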
SQS with Lambda Triggers
Lambda can poll SQS queues automatically via an Event Source Mapping. Lambda scales the number of pollers based on queue depth — up to the Lambda concurrency limit.
# Create Lambda event source mapping
aws lambda create-event-source-mapping \
--function-name ProcessOrder \
--event-source-arn arn:aws:sqs:us-east-1:123456789012:OrderProcessing \
--batch-size 10 \
--maximum-batching-window-in-seconds 5 \
--function-response-types ReportBatchItemFailures
# Lambda handler with partial batch failure reporting
import json

def handler(event, context):
    failed_items = []
    for record in event['Records']:
        try:
            body = json.loads(record['body'])
            process_order(body)
        except Exception as e:
            print(f"Failed: {record['messageId']}: {e}")
            failed_items.append({'itemIdentifier': record['messageId']})
    # Only failed items return to queue; successful ones are deleted
    return {'batchItemFailures': failed_items}
Always enable ReportBatchItemFailures. Without it, if any message in a batch fails, the entire batch is retried, including messages you already processed successfully, which causes duplicates.
SNS Topics and Subscriptions
SNS topics receive published messages and fan them out to all subscribers. Supported subscription protocols include SQS, Lambda, HTTP/HTTPS, email, SMS, mobile push, and Amazon Data Firehose.
# Create a standard SNS topic
aws sns create-topic --name OrderEvents
# Subscribe an SQS queue to the topic
aws sns subscribe \
--topic-arn arn:aws:sns:us-east-1:123456789012:OrderEvents \
--protocol sqs \
--notification-endpoint arn:aws:sqs:us-east-1:123456789012:OrderProcessing
# Allow SNS to send to the SQS queue (queue policy must permit this)
aws sqs set-queue-attributes \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/OrderProcessing \
--attributes '{
"Policy": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"sns.amazonaws.com\"},\"Action\":\"sqs:SendMessage\",\"Resource\":\"arn:aws:sqs:us-east-1:123456789012:OrderProcessing\",\"Condition\":{\"ArnEquals\":{\"aws:SourceArn\":\"arn:aws:sns:us-east-1:123456789012:OrderEvents\"}}}]}"
}'
# Publish a message
aws sns publish \
--topic-arn arn:aws:sns:us-east-1:123456789012:OrderEvents \
--message '{"orderId": "ord-123", "event": "created"}' \
--message-attributes '{"eventType": {"DataType": "String", "StringValue": "ORDER_CREATED"}}'
SNS Fanout Pattern
The SNS+SQS fanout pattern solves a common problem: one event must trigger multiple independent downstream processes. Instead of the producer calling each consumer, it publishes to SNS once, and SNS delivers to multiple SQS queues in parallel.
# Architecture: OrderCreated event fans out to 3 queues
# SNS Topic: OrderEvents
# → SQS: EmailNotificationQueue (send confirmation email)
# → SQS: InventoryUpdateQueue (decrement stock)
# → SQS: AnalyticsQueue (record in data warehouse)
# Subscribe each queue to the same topic
# (assumes the three shell variables already hold the queue ARNs)
for queue_arn in "$email_queue_arn" "$inventory_queue_arn" "$analytics_queue_arn"; do
  aws sns subscribe \
    --topic-arn arn:aws:sns:us-east-1:123456789012:OrderEvents \
    --protocol sqs \
    --notification-endpoint "$queue_arn"
done
Benefits: producer doesn't know about consumers, consumers scale independently, adding a new consumer requires zero producer changes, each queue has its own retry and DLQ policy.
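One gotcha for the queue consumers in this pattern: unless raw message delivery is enabled on the subscription, SNS wraps the published payload in a JSON envelope, and the original message arrives as a string in the envelope's `Message` field. A minimal sketch of unwrapping it, assuming the publisher sends JSON payloads (the helper name `unwrap_sns_envelope` is hypothetical):

```python
import json


def unwrap_sns_envelope(sqs_body):
    """Return the publisher's payload from an SQS message body delivered by SNS.

    Hypothetical helper; assumes the published payload is JSON.
    """
    outer = json.loads(sqs_body)
    # Default delivery: the body is an SNS envelope whose "Message"
    # field holds the original payload as a string.
    if isinstance(outer, dict) and outer.get("Type") == "Notification" and "Message" in outer:
        return json.loads(outer["Message"])
    # Raw message delivery enabled: the body already is the payload.
    return outer
```

Handling both shapes lets the same consumer keep working if raw message delivery is switched on later.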
SNS Message Filtering
Without filtering, every subscriber receives every message. SNS filter policies let subscribers receive only the messages they care about, based on message attributes.
# Add a filter policy to a subscription
# This subscription only receives ORDER_SHIPPED or ORDER_DELIVERED events with priority >= 5
aws sns set-subscription-attributes \
--subscription-arn arn:aws:sns:us-east-1:123456789012:OrderEvents:sub-abc123 \
--attribute-name FilterPolicy \
--attribute-value '{
"eventType": ["ORDER_SHIPPED", "ORDER_DELIVERED"],
"priority": [{"numeric": [">=", 5]}]
}'
# Or use filter policy scope for message body filtering (newer feature)
aws sns set-subscription-attributes \
--subscription-arn arn:aws:sns:us-east-1:123456789012:OrderEvents:sub-abc123 \
--attribute-name FilterPolicyScope \
--attribute-value MessageBody
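For the attribute-based policy shown first (the default scope), the publisher has to send attributes the policy can match. In particular, a numeric condition such as `priority >= 5` only matches an attribute with `DataType` set to `Number`. A small sketch of the publish arguments, with a hypothetical helper name:

```python
import json


def build_publish_kwargs(topic_arn, payload, event_type, priority):
    """Keyword arguments for SNS publish with filterable message attributes.

    Hypothetical helper for illustration only.
    """
    return {
        "TopicArn": topic_arn,
        "Message": json.dumps(payload),
        "MessageAttributes": {
            "eventType": {"DataType": "String", "StringValue": event_type},
            # Numeric filter conditions need DataType "Number";
            # the value itself is still sent as a string.
            "priority": {"DataType": "Number", "StringValue": str(priority)},
        },
    }
```

With a boto3 SNS client this would be called as `sns.publish(**build_publish_kwargs(topic_arn, payload, "ORDER_SHIPPED", 7))`. A message published without the `priority` attribute would not match the example policy, since every key in a filter policy must be satisfied.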
Message Deduplication
Standard SQS queues can deliver duplicates. Handle this in your consumer with idempotency checks rather than relying on the queue. A common pattern uses DynamoDB to track processed message IDs:
import time

import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.resource('dynamodb')
processed_table = dynamodb.Table('ProcessedMessages')

def process_with_idempotency(message_id: str, payload: dict):
    # Attempt to write the message ID; the conditional write fails if it already exists
    try:
        processed_table.put_item(
            Item={
                'messageId': message_id,
                'processedAt': int(time.time()),
                'expiresAt': int(time.time()) + 86400  # TTL attribute: expire after 24h
            },
            ConditionExpression='attribute_not_exists(messageId)'
        )
    except ClientError as e:
        if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
            print(f"Duplicate message {message_id}, skipping")
            return
        raise
    # Only reached if this is the first time seeing this message.
    # Note: if processing fails here, remove the marker so a retry is not skipped.
    do_actual_processing(payload)
Architecture Patterns
Three production-proven patterns combining SQS and SNS:
Pattern 1: Request-Response via SQS. The producer sends to a request queue with a replyToQueue attribute pointing to a response queue. The consumer processes the request and sends the result to the reply queue. Useful for async RPC-style communication.
Pattern 2: Priority Queue. Use two SQS queues (high-priority, low-priority). Consumers poll high-priority first and fall back to low-priority only when high-priority is empty. Lambda concurrency reservations can enforce capacity allocation.
Pattern 3: Competing Consumers with Auto Scaling. Use SQS queue depth (the ApproximateNumberOfMessagesVisible CloudWatch metric) to scale ECS or EC2 consumers. Scale out when the queue grows, scale in when it drains. See the ECS guide for service auto scaling configuration.
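The polling order in Pattern 2 can be sketched as follows. This is one possible shape, assuming `client` behaves like a boto3 SQS client; the function name `receive_by_priority` and its defaults are hypothetical.

```python
def receive_by_priority(client, high_url, low_url, max_messages=10, low_wait=20):
    """Return (queue_url, messages), preferring the high-priority queue.

    Hypothetical helper; `client` is assumed to be a boto3 SQS client.
    """
    high = client.receive_message(
        QueueUrl=high_url,
        MaxNumberOfMessages=max_messages,
        WaitTimeSeconds=1,  # brief long poll so the priority check stays cheap
    )
    if high.get("Messages"):
        return high_url, high["Messages"]
    # High-priority queue is empty: fall back to the low-priority queue.
    low = client.receive_message(
        QueueUrl=low_url,
        MaxNumberOfMessages=max_messages,
        WaitTimeSeconds=low_wait,
    )
    return low_url, low.get("Messages", [])
```

The returned queue URL tells the caller where to delete each message after processing. One trade-off: while the consumer long-polls the low-priority queue, a newly arrived high-priority message waits up to `low_wait` seconds, so lower that value if latency matters more than request cost.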
Frequently Asked Questions
What is the maximum SQS message size?
256 KB per message. For larger payloads, use the SQS Extended Client Library (Java/Python), which stores the payload in S3 and puts a pointer in the SQS message. Alternatively, store the large object in S3 and send just the S3 key in the SQS message.
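The manual S3-pointer approach might look like the sketch below. It assumes `sqs` and `s3` behave like boto3 clients; the helper name, the key prefix, and the pointer format (`s3Bucket`, `s3Key`) are hypothetical and are not the format used by the Extended Client Library.

```python
import json
import uuid

MAX_SQS_BYTES = 256 * 1024  # size limit quoted above


def send_with_s3_offload(sqs, s3, queue_url, bucket, payload):
    """Send payload via SQS, offloading to S3 when it exceeds the size limit.

    Returns the S3 key if the payload was offloaded, else None.
    Hypothetical helper; `sqs` and `s3` are assumed to be boto3 clients.
    """
    body = json.dumps(payload)
    encoded = body.encode("utf-8")
    if len(encoded) <= MAX_SQS_BYTES:
        sqs.send_message(QueueUrl=queue_url, MessageBody=body)
        return None
    # Too large: store the payload in S3 and send only a pointer.
    key = f"sqs-payloads/{uuid.uuid4()}.json"
    s3.put_object(Bucket=bucket, Key=key, Body=encoded)
    pointer = json.dumps({"s3Bucket": bucket, "s3Key": key})
    sqs.send_message(QueueUrl=queue_url, MessageBody=pointer)
    return key
```

The size check here covers the body only; message attributes also count toward the limit, so leave headroom if you attach any. The consumer needs the matching logic: detect the pointer, fetch the object from S3, and delete it once processing succeeds.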
How long can a message stay in an SQS queue?
Up to 14 days (the maximum retention period). The default is 4 days. Messages not consumed within the retention period are permanently deleted. Set up CloudWatch alarms on ApproximateAgeOfOldestMessage to detect stuck consumers before messages expire.
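What happens if SNS cannot deliver a message?
SNS retries according to the delivery policy for the endpoint type. For HTTP/HTTPS endpoints, the default policy makes up to 50 delivery attempts over about 6 hours. For AWS-managed endpoints such as SQS and Lambda, SNS keeps retrying for up to 23 days while the endpoint is unavailable. SNS invokes Lambda asynchronously, so if the function itself throws an error, Lambda's asynchronous retry behavior (two retries by default) applies. For critical messages, always use SNS → SQS → Lambda rather than SNS → Lambda directly, so you get the SQS durability buffer.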
Can I use SQS with Spring Boot?
Yes. Spring Cloud AWS provides @SqsListener annotation-based consumer support. Add spring-cloud-aws-starter-sqs to your dependencies and annotate a method with @SqsListener("queue-name"). It handles polling, deserialization, and acknowledgment automatically.
How does SQS compare to Apache Kafka?
SQS is a fully managed, serverless queue: messages are deleted after consumption, there is no consumer group offset management, and there is no replay by default. Kafka retains messages for configurable periods and supports replay, multiple consumer groups reading the same data independently, and higher throughput at scale. For AWS-native workloads without replay requirements, SQS is simpler. For complex streaming pipelines, consider Amazon MSK (managed Kafka).