Enterprise-grade architecture combining AWS Route 53 advanced features, ECS Fargate containerization, AI-powered threat detection via Amazon Bedrock, and automated remediation through Lambda. Designed for high availability, security, and intelligent operations.
Key Features:
- ✅ Multi-region active-passive failover (Route 53)
- ✅ Hybrid DNS (on-premises + AWS integration)
- ✅ DNS-level threat protection (DNS Firewall)
- ✅ Real-time log aggregation (CloudWatch)
- ✅ Long-term data lake (S3 + lifecycle management)
- ✅ AI-powered threat detection (Amazon Bedrock)
- ✅ Automated security response (Lambda remediation)
- ✅ Complete audit trail (DynamoDB + CloudWatch)
┌──────────────────────────────────────────────────────────────────────────────┐
│ INTERNET USERS │
│ (Global Customer Traffic) │
└────────────────────────────────────┬─────────────────────────────────────────┘
│
╔════════════════╩════════════════╗
║ ║
┌───────▼────────┐ ┌────────────▼──────────┐
│ Route 53 │ │ Route 53 │
│ Global DNS │ │ Global DNS │
│ (Anycast) │ │ (Anycast) │
└────────┬───────┘ └────────┬───────────────┘
│ │
┌────────────┴────────────┐ ┌──────────┴────────────┐
│ │ │ │
┌───▼──────────────┐ ┌────────▼──────────────────────┐
│ US-EAST-1 │ │ US-WEST-2 │
│ (PRIMARY) │ │ (DR/STANDBY) │
└───┬──────────────┘ └────┬──────────────────────────┘
│ │
┌───┴──────────────────────┴──────────────────────────┐
│ HYBRID DNS LAYER (Route 53 Advanced) │
├────────────────────────────────────────────────────┤
│ ✓ Route 53 Profiles (Geolocation routing) │
│ ✓ Route 53 Resolver (On-prem hybrid DNS) │
│ ✓ Private Hosted Zones (Internal DNS) │
│ ✓ DNS Firewall (Threat blocking) │
│ ✓ Query Logging (CloudWatch integration) │
└───┬──────────────────────────────────────────────────┘
│
├──────────────────────┬──────────────────────┐
│ │ │
┌───▼────────┐ ┌────▼────────┐ ┌───▼─────────┐
│ ALB │ │ ALB │ │ Route 53 │
│ (Public) │ │ (Public) │ │ Resolver │
└───┬────────┘ └────┬────────┘ └─────────────┘
│ │
┌───┴──────────────────────┴────────────────────────┐
│ ECS FARGATE CLUSTER (Container Workloads) │
├────────────────────────────────────────────────────┤
│ • API Service (Node.js) │
│ • Web Service (Nginx) │
│ • Monitoring Agent (Log forwarding) │
│ • CloudWatch Agent (Metrics collection) │
└───┬──────────────────────────────────────────────────┘
│
│ (Logs → CloudWatch)
│
┌───▼────────────────────────────────────────────────┐
│ CloudWatch Logs (Real-time, 30-day retention) │
├────────────────────────────────────────────────────┤
│ • Route 53 Query Logs │
│ • DNS Firewall Blocks │
│ • ECS Container Logs │
│ • Application Logs (JSON structured) │
└───┬──────────────────────────────────────────────────┘
│
│ (Daily export)
│
┌───▼────────────────────────────────────────────────┐
│ S3 Data Lake (Long-term archival) │
├────────────────────────────────────────────────────┤
│ • Route 53 query logs (partitioned by date) │
│ • DNS Firewall logs (partitioned by date) │
│ • Application logs (compressed, archived) │
│ • Lifecycle: Standard → Glacier → Deep Archive │
└───┬──────────────────────────────────────────────────┘
│
│ (Orchestrated analysis)
│
┌───▼────────────────────────────────────────────────┐
│ AWS Lambda: threat-detection-orchestrator │
├────────────────────────────────────────────────────┤
│ • Extract data from S3 │
│ • Prepare threat analysis prompt │
│ • Invoke Bedrock AI models │
│ • Process threat detections │
│ • Determine remediation actions │
└───┬──────────────────────────────────────────────────┘
│
│ (Send threat analysis requests)
│
┌───▼────────────────────────────────────────────────┐
│ Amazon Bedrock: Claude 3 Sonnet │
├────────────────────────────────────────────────────┤
│ AI-Powered Threat Detection: │
│ • DGA (Domain Generation Algorithm) detection │
│ • Ransomware C&C identification │
│ • DNS exfiltration pattern recognition │
│ • Anomaly detection in traffic patterns │
│ • Predictive threat assessment │
│ • Root cause analysis │
│ • Remediation recommendations │
└───┬──────────────────────────────────────────────────┘
│
│ (Threat detection results)
│
├──────────────┬──────────────┬──────────────┐
│ │ │ │
┌───▼──────┐ ┌────▼────┐ ┌─────▼───┐ ┌───────▼──────┐
│DynamoDB │ │ SNS │ │Dashboard│ │ S3 Reports │
│Tracking │ │ Topics │ │ Update │ │ (Archive) │
└──────────┘ └────┬────┘ └─────────┘ └──────────────┘
│
│ (Alert security team)
│
┌────────▼──────────┐
│ Email / Slack │
│ Security Team │
└────────┬──────────┘
│
│ (High severity threats)
│
┌──────────▼────────────┐
│ SNS → Lambda Trigger │
└──────────┬────────────┘
│
┌──────────────────▼──────────────────┐
│ AWS Lambda: Auto-Remediation │
├─────────────────────────────────────┤
│ 1. Update DNS Firewall rules │
│ 2. Modify Route 53 traffic policies │
│ 3. Update security group rules │
│ 4. Create CloudWatch alarms │
│ 5. Log all actions (audit trail) │
│ 6. Notify security team │
│ 7. Create JIRA incident ticket │
└──────────────────┬──────────────────┘
│
┌──────────────┼──────────────┐
│ │ │
┌───▼─────┐ ┌─────▼────┐ ┌────▼──────┐
│DNS Rule │ │Route53 │ │Security │
│Updated │ │Policy │ │Group │
│ │ │Adjusted │ │Updated │
└─────────┘ └──────────┘ └───────────┘
│
│ (Changes propagate)
│
THREAT MITIGATED ✅
Public Hosted Zone (example.com)
- Handles global DNS queries from internet users
- Uses anycast routing (queries routed to nearest edge)
- TTL: 60 seconds (fast failover)
- Health checks on ALBs (automatic failover)
- Weighted routing: 70% us-east-1, 30% us-west-2
Private Hosted Zones (internal.example.com)
- Only resolvable from within VPCs
- Associated with both regions
- Records: api.internal, db.internal, cache.internal, etc.
- TTL: 300 seconds (less frequent changes)
Route 53 Resolver (Hybrid DNS)
- Inbound endpoint: Accepts DNS queries from on-premises
- Outbound endpoint: Forwards VPC queries to on-premises DNS
- Enables seamless hybrid cloud environment
DNS Firewall
- Rule Groups: Malware domains, Ransomware, Cryptomining, Suspicious patterns
- Actions: BLOCK (drop query), ALERT (log + block), ALLOW
- Logging: All blocks logged to CloudWatch → S3
- Updates: Lambda can add/remove rules dynamically
Route 53 Query Logging
- Logs every DNS query (source IP, domain, response code)
- Destination: CloudWatch Logs
- Volume: Can handle 1M+ queries per second
- Retention: 30 days in CloudWatch, then archived to S3
Cluster Configuration
- Two clusters (one per region)
- Fargate launch type (serverless containers)
- VPC integration with private subnets
- Auto Scaling Groups (min 2-3, max 10-20 tasks per service)
Services
-
API Service
- Container: Node.js 18 app
- Desired count: 3-10 (auto-scales)
- Health check: /health endpoint (30s interval)
- Logs: CloudWatch (/ecs/api-service)
-
Web Service
- Container: Nginx reverse proxy
- Desired count: 2-5 (auto-scales)
- Health check: HTTP GET / (200 OK expected)
- Logs: CloudWatch (/ecs/web-service)
-
Monitoring Agent
- Lightweight container
- Collects and forwards logs
- Runs 1 per cluster (no scaling)
Log Groups
/aws/route53/queries- All DNS queries/aws/route53/firewall- DNS Firewall blocks/ecs/api-service- API application logs/ecs/web-service- Web server logs/lambda/threat-detection- Threat detection execution/lambda/auto-remediation- Remediation actions
Dashboards
- Real-time traffic metrics
- DNS query rates and error rates
- Firewall block statistics
- Application performance (latency, errors)
- Security alerts summary
Alarms
- High error rate (>1% of requests)
- High latency (p99 > 500ms)
- DNS firewall blocks (>100 per minute)
- Lambda errors (any errors = alert)
Bucket Structure
s3://dns-operations-datalake/
├── logs/
│ ├── route53/queries/year=2026/month=05/day=31/...
│ ├── route53/firewall/year=2026/month=05/day=31/...
│ ├── ecs/api/year=2026/month=05/day=31/...
│ └── ecs/web/year=2026/month=05/day=31/...
├── models/
│ ├── threat-patterns/
│ └── ml-artifacts/
├── reports/
│ ├── threat-analysis/2026-05-31-daily-report.md
│ └── incidents/INC-001-2026-05-31.json
└── archive/
Lifecycle Policies
- Standard (0-90 days): Immediate retrieval needed
- Glacier (90 days - 1 year): Infrequent access
- Deep Archive (1+ year): Compliance/legal hold
Export to S3
- Daily scheduled Lambda exports CloudWatch logs to S3
- Partitioned by date/hour for queryability
- Compressed (.gz) to save space
Model: Claude 3 Sonnet
- Optimized for security analysis
- Context: 200K tokens (can analyze massive log files)
- Temperature: 0.3 (deterministic, consistent results)
Input to Bedrock
- DNS query summary (top domains, volume, patterns)
- DNS firewall blocks (what was blocked)
- Application metrics (errors, latency)
- Traffic anomalies detected
- Historical patterns
Threat Detection Capabilities
- DGA detection (Domain Generation Algorithm)
- Ransomware C&C identification
- DNS exfiltration (data smuggling)
- Crypto-mining infrastructure
- APT activity indicators
- Anomalies in traffic patterns
Output from Bedrock
- Threat severity (Critical/High/Medium/Low)
- Confidence score (0-100%)
- Root cause analysis
- Recommended remediation actions
- False positive risk assessment
Function 1: threat-detection-orchestrator
- Trigger: CloudWatch Events (hourly)
- Timeout: 15 minutes
- Memory: 3008 MB (for processing large datasets)
Execution Steps
- Query S3 data lake (last hour of logs)
- Decompress and parse JSON logs
- Aggregate metrics (query counts, patterns, etc.)
- Format prompt for Bedrock
- Invoke Claude 3 Sonnet
- Parse response (detect threats)
- Store in DynamoDB (tracking)
- Publish to SNS (if threats detected)
Function 2: auto-remediation
- Trigger: SNS topic (threat-alerts)
- Timeout: 60 seconds
- Memory: 1024 MB
Remediation Actions
- Update DNS Firewall (add block rules)
- Modify Route 53 traffic policies
- Update ALB security groups
- Create CloudWatch alarms
- Log to DynamoDB (audit trail)
- Send SNS notifications (team alerts)
- Create JIRA ticket (incident tracking)
Table: security-incidents
- Partition Key: incident_id
- Sort Key: timestamp
- TTL: 365 days (auto-delete old incidents)
Attributes
- incident_id (unique)
- threat_id (link to threat detection)
- severity (CRITICAL, HIGH, MEDIUM, LOW)
- threat_type (DGA, ransomware, etc.)
- affected_domains (list)
- affected_ips (list)
- remediation_actions (list)
- status (pending, in-progress, resolved)
- assigned_to (on-call engineer)
- created_timestamp
- resolved_timestamp
- jira_ticket_url
- bedrock_analysis (JSON)
Defense in Depth
- DNS Firewall (DNS layer threat protection)
- ALB + WAF (HTTP layer protection)
- Security Groups (network layer)
- IAM policies (least privilege access)
Automation & Speed
- Threats detected and blocked in seconds
- No manual intervention for obvious threats
- Escalation to team for investigation
Audit Trail
- Every DNS query logged
- Every remediation action logged
- Immutable logs in S3 (compliance ready)
- CloudTrail for AWS API calls
Redundancy
- Multi-region failover (Route 53)
- Multi-AZ deployment (within regions)
- Auto Scaling (handle traffic spikes)
- DynamoDB on-demand (no capacity management)
| Component | Time | Difficulty |
|---|---|---|
| Route 53 setup | 2 hours | Medium |
| VPC & Networking | 4 hours | Medium |
| ECS Fargate | 6 hours | High |
| CloudWatch setup | 3 hours | Medium |
| S3 Data Lake | 2 hours | Low |
| Lambda functions | 8 hours | High |
| Bedrock integration | 4 hours | Medium |
| Testing & validation | 8 hours | High |
| TOTAL | ~37 hours | Medium |
| Service | Usage | Cost |
|---|---|---|
| Route 53 | 1M queries/month | $50 |
| ECS Fargate | 4 vCPU, 8GB RAM | $800 |
| ALB | 2 regional | $300 |
| CloudWatch | 10GB logs | $150 |
| S3 | 500GB standard | $20 |
| Lambda | 100 invocations/day | $50 |
| Bedrock | 1000 invocations/month | $500 |
| DynamoDB | On-demand, light | $50 |
| Data Transfer | 100GB/month | $100 |
| TOTAL | ~$2,020/month |
- Create Route 53 zones
- Deploy ECS clusters
- Set up CloudWatch logging
- Create S3 data lake
- Deploy Lambda functions
- Configure Bedrock integration
- Test threat detection pipeline
- Implement auto-remediation
- Set up alerting (email/Slack/PagerDuty)
- Document runbooks for manual intervention
This architecture provides enterprise-grade DNS operations with intelligent threat detection and automated response.