This repository demonstrates an AWS fanout architecture built with Terraform and Python Lambdas. An HTTP API publishes messages to an SNS topic which fans out to multiple SQS queues. Each queue has a dedicated consumer Lambda that persists the message to DynamoDB only after successful processing. Failed messages are retried up to 3 times before landing in a Dead-Letter Queue (DLQ).
- SNS topic (fanout) that publishes to multiple SQS queues
- Publisher Lambda (API Gateway HTTP API): only publishes to SNS, no persistence
- Consumer Lambdas: one per SQS queue, each saves the processed message to DynamoDB
- Dead-Letter Queues: one per main queue, each receives messages after 3 failed consumer attempts
- DynamoDB table: written by consumers on success (proof of processing, not just receipt)
- CloudWatch Log Groups, Metric Filters and Alarms for SNS, SQS, DLQs and Lambda errors
- 60-second delivery delay on all queues: messages sit in SQS for 60s in `Delayed` state before consumers can pick them up; use `make poll` to watch the lifecycle in real time
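The "persist only after successful processing" behaviour of the consumers can be sketched as follows. This is a minimal illustration, not the repository's actual `consumer.py`: the function names `build_item` and `handle` and the injected `put_item` callable are hypothetical, and it assumes SNS raw message delivery is disabled so each SQS body carries the SNS JSON envelope. The attribute names match those used in the DynamoDB scans later in this README.

```python
import json
import time


def build_item(record, consumer_id, now=None):
    """Convert one SQS record into the item a consumer would store."""
    # Assumption: raw message delivery is off, so the SQS body is the SNS
    # JSON envelope and the published payload sits under "Message".
    envelope = json.loads(record["body"])
    return {
        "message_id": envelope.get("MessageId", record["messageId"]),
        "consumer_id": consumer_id,
        "body": envelope.get("Message", record["body"]),
        "processed_at": int(now if now is not None else time.time()),
    }


def handle(event, consumer_id, put_item):
    """Process every record; write to the table only after processing succeeds."""
    for record in event["Records"]:
        item = build_item(record, consumer_id)  # raises on malformed input
        # In a real Lambda this could be a boto3 Table.put_item(Item=item) call.
        put_item(item)
    return {"processed": len(event["Records"])}
```

Because any exception propagates out of `handle`, nothing is written for a failed record and the message becomes eligible for redelivery, which is what eventually routes it to the DLQ.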
- Terraform >= 1.0
- AWS CLI configured with credentials and default region
- Python 3.13
- Node.js >= 18 and npm (required by Serverless Framework and its plugins)
Run `make requirements` to install all tool dependencies at once (see Makefile below).
aws_fanout_pattern/
├── Makefile                 # All common tasks (see below)
├── terraform/
│   ├── main.tf              # Terraform configuration (SNS, SQS, Lambda, IAM, CloudWatch)
│   ├── lambda/
│   │   ├── publisher.py     # Publisher Lambda: publishes to SNS
│   │   ├── sqs_logger.py    # Logger Lambda: writes SQS/SNS events to CloudWatch log groups
│   │   └── test_fail.py     # (optional) test lambda used during debugging
│   └── outputs.json         # Generated by Terraform after apply, consumed by serverless/
├── serverless/
│   ├── serverless.yml       # Serverless Framework configuration
│   ├── package.json         # Node.js deps: serverless + serverless-python-requirements
│   ├── requirements.txt     # Python deps packaged into Lambda (boto3)
│   ├── handlers/
│   │   ├── publisher.py     # HTTP API Lambda handler (publishes to SNS)
│   │   └── consumer.py      # SQS consumer Lambda handler (saves to DynamoDB)
│   ├── publish.py           # Local script to publish a test message to SNS
│   └── receive_sqs.py       # Local script to poll SQS manually
└── README.md
All common operations are available as make targets. Run make help to see them:
Usage: make <target> [REGION=<aws-region>]
REGION AWS region to deploy to (default: eu-west-1)
Targets:
requirements Install all local dependencies (npm + pip + serverless)
terraform-init Initialize Terraform providers
terraform-apply Deploy infrastructure (SNS, SQS, IAM, CloudWatch)
serverless-deploy Deploy Lambda functions via Serverless Framework
publish Publish a test message to SNS
poll Watch all 3 SQS queues every 10s (Delayed/InFlight/Visible), Ctrl+C to stop
destroy Remove ALL resources (Serverless + Terraform)
Examples:
make terraform-apply REGION=eu-west-1
make serverless-deploy REGION=eu-west-1
make publish && make poll
| Target | What it does |
|---|---|
| `make requirements` | Creates a Python venv at `serverless/.venv/`, installs Python deps, Serverless Framework v3 globally, and the `serverless-python-requirements` plugin |
| `make terraform-init` | Runs `terraform init` inside `terraform/` |
| `make terraform-apply [REGION=...]` | Runs `terraform apply` with `aws_region` and writes `outputs.json` |
| `make serverless-deploy [REGION=...]` | Runs `npm install` + `sls deploy --region` inside `serverless/` |
| `make publish [REGION=...]` | Publishes a test message to SNS using the venv Python |
| `make poll [REGION=...]` | Watches all 3 SQS queues every 10s showing Delayed / InFlight / Visible counts; runs until Ctrl+C |
| `make destroy [REGION=...]` | Removes all resources: Serverless stack first, then Terraform infrastructure |
All deployment targets accept a REGION parameter. If not specified, eu-west-1 is used by default.
# Deploy to Ireland (default)
make terraform-apply REGION=eu-west-1
make serverless-deploy REGION=eu-west-1
# Deploy to São Paulo
make terraform-apply REGION=sa-east-1
make serverless-deploy REGION=sa-east-1
# Deploy to Singapore
make terraform-apply REGION=ap-southeast-1
make serverless-deploy REGION=ap-southeast-1

Always use the same `REGION` for `terraform-apply` and `serverless-deploy` so both point to the same region.
make requirements

This installs:
- A Python virtual environment at `serverless/.venv/` (avoids PEP 668 system-package conflicts)
- `boto3` and any other Python deps into the venv (from `serverless/requirements.txt`)
- `serverless@3` globally via npm
- The `serverless-python-requirements` plugin (from `serverless/package.json`)
make terraform-init
make terraform-apply REGION=eu-west-1

This provisions SNS, SQS queues, DLQs, DynamoDB, IAM roles, and CloudWatch Log Groups, and writes `terraform/outputs.json`, which is consumed by the Serverless deployment.
Note: The Terraform configuration uses a `local-exec` provisioner to set SNS topic attributes via the AWS CLI. Ensure the AWS CLI is configured with the correct credentials and region.
make serverless-deploy REGION=eu-west-1

This packages and deploys the publisher and consumer Lambda functions using the Serverless Framework.
Open two terminals:
Terminal 1 (start the watcher before publishing):
make poll

Terminal 2 (publish the message):
make publish

`make poll` refreshes every 10 seconds and shows the full message lifecycle:
─────────────────────────────────────── 14:32:00
Queue 1 → Delayed: 1 | InFlight: 0 | Visible: 0 ← message just arrived
Queue 2 → Delayed: 1 | InFlight: 0 | Visible: 0
Queue 3 → Delayed: 1 | InFlight: 0 | Visible: 0
─────────────────────────────────────── 14:33:00 ← after 60s delay
Queue 1 → Delayed: 0 | InFlight: 1 | Visible: 0 ← Lambda processing
Queue 2 → Delayed: 0 | InFlight: 1 | Visible: 0
Queue 3 → Delayed: 0 | InFlight: 1 | Visible: 0
─────────────────────────────────────── 14:33:10
Queue 1 → Delayed: 0 | InFlight: 0 | Visible: 0 ← processed ✓
Queue 2 → Delayed: 0 | InFlight: 0 | Visible: 0
Queue 3 → Delayed: 0 | InFlight: 0 | Visible: 0
Press Ctrl+C to stop polling. Then verify the 3 DynamoDB records were written:
TABLE=$(jq -r .dynamodb_table_name terraform/outputs.json)
aws dynamodb scan \
--region eu-west-1 \
--table-name $TABLE \
--query "Items[*].{consumer:consumer_id.S, body:body.S, at:processed_at.N}" \
--output table

The goal is to verify that one published message produces one DynamoDB record per consumer (3 records total), and that a consumer failure routes the message to its DLQ after 3 retries.
All commands below read from `terraform/outputs.json`, so make sure Terraform has been applied first.
# Convenience variables: run these once in your shell
REGION=eu-west-1
API_ENDPOINT=$(jq -r .api_endpoint terraform/outputs.json)
SNS_ARN=$(jq -r .sns_topic_arn terraform/outputs.json)
TABLE=$(jq -r .dynamodb_table_name terraform/outputs.json)
QUEUE_1_URL=$(jq -r '.queues[0].url' terraform/outputs.json)
QUEUE_2_URL=$(jq -r '.queues[1].url' terraform/outputs.json)
QUEUE_3_URL=$(jq -r '.queues[2].url' terraform/outputs.json)
DLQ_1_URL=$(jq -r '.dlqs[0].url' terraform/outputs.json)

The queues have a 60-second delivery delay: messages arrive in SQS immediately in `Delayed` state and consumers cannot pick them up for 60 seconds. `make poll` watches this lifecycle automatically every 10 seconds.
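For scripted checks, the same values can be read in Python instead of `jq`. The sketch below assumes `outputs.json` has the flat shape implied by the `jq` expressions above (top-level `api_endpoint`, `sns_topic_arn`, `dynamodb_table_name`, and `queues` / `dlqs` lists of objects with a `url` key); `parse_outputs` and `load_outputs` are hypothetical helper names, not part of the repository.

```python
import json
from pathlib import Path


def parse_outputs(data):
    """Pick out the fields used by the test commands from the parsed JSON."""
    return {
        "api_endpoint": data["api_endpoint"],
        "sns_topic_arn": data["sns_topic_arn"],
        "table": data["dynamodb_table_name"],
        "queue_urls": [queue["url"] for queue in data["queues"]],
        "dlq_urls": [dlq["url"] for dlq in data["dlqs"]],
    }


def load_outputs(path="terraform/outputs.json"):
    """Read the Terraform outputs file and return the parsed fields."""
    return parse_outputs(json.loads(Path(path).read_text()))
```

A missing key raises `KeyError`, which is a quick signal that Terraform has not been applied yet or that the file has a different shape than assumed here.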
Terminal 1:
make poll

Terminal 2:
# Publish via Makefile (uses publish.py + SNS ARN from outputs.json)
make publish
# Or publish via the HTTP API directly
curl -s -X POST "$API_ENDPOINT/publish" \
-H 'Content-Type: application/json' \
-d '{"text": "fanout-test-1"}' \
-w '\nHTTP_STATUS:%{http_code}\n'

Watch Terminal 1 progress through three phases:
| Phase | Delayed | InFlight | Visible | Meaning |
|---|---|---|---|---|
| 0-60s | 1 | 0 | 0 | Message held in delay window |
| ~60s | 0 | 1 | 0 | Consumer Lambda executing |
| done | 0 | 0 | 0 | Processed and saved to DynamoDB |
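The phases in the table map directly onto the three SQS queue attributes that the queue checks later in this README read with `get-queue-attributes`. As a rough sketch (the function name `classify_phase` and its return labels are illustrative, and this is not the code behind `make poll`), the mapping could look like this:

```python
def classify_phase(attributes):
    """Map SQS approximate counters onto the lifecycle phases in the table."""
    # SQS returns attribute values as strings, so convert before comparing.
    delayed = int(attributes.get("ApproximateNumberOfMessagesDelayed", 0))
    in_flight = int(attributes.get("ApproximateNumberOfMessagesNotVisible", 0))
    visible = int(attributes.get("ApproximateNumberOfMessages", 0))

    if delayed > 0:
        return "delayed"    # message held in the delay window
    if in_flight > 0:
        return "in-flight"  # a consumer Lambda is executing
    if visible > 0:
        return "visible"    # waiting for a consumer to pick it up
    return "done"           # nothing left in the queue
```

The counters are approximate, and an empty queue also reports `done` before anything has been published, so the result is only meaningful while watching a message you just sent.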
After polling completes, verify 3 DynamoDB records β one per consumer:
aws dynamodb scan \
--region $REGION \
--table-name $TABLE \
--filter-expression "contains(body, :v)" \
--expression-attribute-values '{":v": {"S": "fanout-test-1"}}' \
--query "Items[*].{consumer:consumer_id.S, message_id:message_id.S, at:processed_at.N}" \
--output table

Expected: 3 rows, one each for consumer_1, consumer_2, and consumer_3.
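The "one row per consumer" expectation can also be checked programmatically. The sketch below assumes the scan is run with `--output json` and without the `--query` projection, so each item keeps DynamoDB's typed form (for example `{"consumer_id": {"S": "consumer_1"}}`); `check_fanout` is a hypothetical helper, not part of the repository.

```python
from collections import Counter


def check_fanout(items, expected=("consumer_1", "consumer_2", "consumer_3")):
    """Return (missing, duplicated) consumer ids for one published message."""
    counts = Counter(item["consumer_id"]["S"] for item in items)
    missing = [name for name in expected if counts[name] == 0]
    duplicated = [name for name in expected if counts[name] > 1]
    return missing, duplicated
```

An empty `missing` list means the fan-out reached every consumer. A non-empty `duplicated` list is not necessarily a bug: if the queues are standard queues, delivery is at-least-once, so occasional duplicates are possible.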
Useful to test the fan-out in isolation, without API Gateway involved.
aws sns publish \
--region $REGION \
--topic-arn $SNS_ARN \
--message '{"text": "fanout-test-direct"}'

Then repeat the DynamoDB scan from Test 1 with `fanout-test-direct`. Expect 3 new records.
After a successful fan-out all messages should have been consumed. Check that no messages are stuck in the queues.
for URL in "$QUEUE_1_URL" "$QUEUE_2_URL" "$QUEUE_3_URL"; do
  aws sqs get-queue-attributes \
    --region "$REGION" \
    --queue-url "$URL" \
    --attribute-names ApproximateNumberOfMessages ApproximateNumberOfMessagesNotVisible ApproximateNumberOfMessagesDelayed \
    --query "Attributes" \
    --output table
done

Expected: all three counters at 0 on every queue.
To confirm messages route to the DLQ after 3 failed processing attempts, temporarily break consumer_1 by setting an invalid DDB_TABLE_NAME via the AWS console or CLI, then publish a message.
# 1. Check DLQ is empty before the test
aws sqs get-queue-attributes \
--region $REGION \
--queue-url $DLQ_1_URL \
--attribute-names ApproximateNumberOfMessages \
--query "Attributes.ApproximateNumberOfMessages"
# 2. Publish a message (with consumer_1 broken, it will fail 3 times)
aws sns publish \
--region $REGION \
--topic-arn $SNS_ARN \
--message '{"text": "dlq-test"}'
# 3. After the 60s delivery delay, wait ~90 more seconds (3 attempts × 30s visibility timeout), then check DLQ
aws sqs get-queue-attributes \
--region $REGION \
--queue-url $DLQ_1_URL \
--attribute-names ApproximateNumberOfMessages \
--query "Attributes.ApproximateNumberOfMessages"

Expected: DLQ count goes from "0" to "1". consumer_2 and consumer_3 still succeed and write to DynamoDB normally.
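The waiting time for this test follows from simple arithmetic over the queue settings. The sketch below is an estimate under stated assumptions, not a guarantee: it uses the 60s delivery delay, 30s visibility timeout, and 3 attempts mentioned in this README, and it assumes the consumer fails almost immediately and each failed message becomes receivable again only when its visibility timeout expires. The function name is hypothetical.

```python
def seconds_until_dlq(delivery_delay=60, visibility_timeout=30, max_receive_count=3):
    """Rough time from publish until a always-failing message reaches the DLQ."""
    # The message first waits out the delivery delay, then each failed
    # attempt holds it invisible for one visibility timeout.
    return delivery_delay + max_receive_count * visibility_timeout


estimate = seconds_until_dlq()
print(f"Expect the DLQ count to change about {estimate}s after publishing")
```

With the defaults this gives 150 seconds measured from the publish call, or 90 seconds measured from the moment the delay window ends. Real timings will drift by a few seconds because of polling and Lambda start-up.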
# 4. Inspect the message in the DLQ (without deleting it)
aws sqs receive-message \
--region $REGION \
--queue-url $DLQ_1_URL \
--max-number-of-messages 1 \
--visibility-timeout 0 \
--query "Messages[0].Body"

After Test 4, the DLQ alarm should have triggered.
aws cloudwatch describe-alarms \
--region $REGION \
--alarm-name-prefix "dlq-messages-visible" \
--query "MetricAlarms[*].{Name:AlarmName, State:StateValue, Reason:StateReason}" \
--output table

Expected: at least one alarm in ALARM state for the queue that had the failing consumer.
make destroy REGION=eu-west-1

In production you might want to:
- Give each Lambda its own execution role with least-privilege permissions.
- Use structured JSON logs so CloudWatch metric filters are more precise.
- Configure SNS subscription confirmations and security settings for your environment.
- Remove or reduce the `delay_seconds` on the SQS queues (set to 60s here for educational purposes only).
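As an illustration of the structured-logging suggestion, a Lambda handler could emit one JSON object per log line. This is a minimal sketch rather than the repository's logging code; `log_json` and its field names (`level`, `message`, `timestamp`) are assumptions chosen for the example.

```python
import json
import time


def log_json(level, message, **fields):
    """Print one JSON log line and return it (Lambda stdout goes to CloudWatch)."""
    entry = {
        "level": level,
        "message": message,
        "timestamp": int(time.time()),
        **fields,  # extra context such as consumer_id or message_id
    }
    line = json.dumps(entry, sort_keys=True)
    print(line)
    return line


log_json("ERROR", "DynamoDB write failed", consumer_id="consumer_1")
```

With logs in this shape, a CloudWatch metric filter can match on a field with a JSON pattern such as `{ $.level = "ERROR" }` instead of searching for a substring anywhere in the line.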
