This project provides a complete Terraform setup to expose Claude AI models through AWS Bedrock via a REST API. All usage is billed directly through AWS, eliminating the need for separate Anthropic API credits.
```
Client → API Gateway → Lambda → AWS Bedrock → Claude Models
         (with API Key) (IAM Auth) (AWS Billing)
```
- API Gateway: RESTful endpoint with optional API key authentication
- Lambda: Python function that invokes Bedrock models
- IAM Roles/Policies: Secure access to Bedrock services
- CloudWatch Logs: Request and response logging
- Usage Plans: Rate limiting and quota management
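
To make the Lambda component more concrete, here is a minimal sketch of how such a function might turn an incoming request into a Bedrock call. It is not the project's actual Lambda source. The names `build_bedrock_request`, `DEFAULT_MODEL_ID`, and `DEFAULT_MAX_TOKENS` are hypothetical, the event is assumed to be an API Gateway proxy event, and the request fields are assumed to be the ones documented in this README.

```python
import json

# Assumed defaults, mirroring the default_model_id and max_tokens variables.
DEFAULT_MODEL_ID = "anthropic.claude-3-5-sonnet-20241022-v2:0"
DEFAULT_MAX_TOKENS = 4096


def build_bedrock_request(payload: dict) -> tuple[str, dict]:
    """Translate an API request body into (model_id, Bedrock request body)."""
    messages = payload.get("messages")
    if not messages:
        raise ValueError("'messages' must be a non-empty list")

    body = {
        # Version string Bedrock expects for Anthropic Messages requests.
        "anthropic_version": "bedrock-2023-05-31",
        "messages": messages,
        "max_tokens": int(payload.get("max_tokens", DEFAULT_MAX_TOKENS)),
    }
    # Optional fields are forwarded only when the caller supplied them.
    for key in ("temperature", "system"):
        if key in payload:
            body[key] = payload[key]

    return payload.get("model", DEFAULT_MODEL_ID), body


def lambda_handler(event, context, bedrock_runtime=None):
    """Hypothetical handler for an API Gateway proxy event."""
    if bedrock_runtime is None:
        import boto3  # imported lazily so the helper above can run offline

        bedrock_runtime = boto3.client("bedrock-runtime")

    try:
        payload = json.loads(event.get("body") or "{}")
        model_id, body = build_bedrock_request(payload)
    except ValueError as exc:  # also covers json.JSONDecodeError
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}

    response = bedrock_runtime.invoke_model(modelId=model_id, body=json.dumps(body))
    result = json.loads(response["body"].read())
    return {"statusCode": 200, "body": json.dumps(result)}
```

The sketch returns the Bedrock response unchanged. The deployed function may reshape it, for example by adding a `total_tokens` field, so treat this only as an outline of the request flow.
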
- ✅ AWS-native billing - All costs through your AWS account
- ✅ Multiple Claude models - Support for Claude 3.5 Sonnet, Claude 3.5 Haiku, Claude 3 Opus, and Claude 3 Sonnet
- ✅ API key authentication - Secure your endpoint
- ✅ Rate limiting - Built-in throttling and quotas
- ✅ Streaming support - Ready for streaming responses (with minor modifications)
- ✅ Comprehensive logging - CloudWatch integration
- ✅ Easy customization - Full Terraform variables
| Model ID | Description | Use Case |
|---|---|---|
| `anthropic.claude-3-5-sonnet-20241022-v2:0` | Most intelligent model | Complex tasks, analysis |
| `anthropic.claude-3-5-haiku-20241022-v1:0` | Fastest, most affordable | Quick responses, high volume |
| `anthropic.claude-3-opus-20240229-v1:0` | Previous flagship | Legacy applications |
| `anthropic.claude-3-sonnet-20240229-v1:0` | Balanced performance | General purpose |
- AWS Account with Bedrock access enabled
- Terraform >= 1.0
- AWS CLI configured with appropriate credentials
- Bedrock Model Access - Request access to Claude models in AWS Console
  - Go to AWS Console → Bedrock → Model Access
  - Request access to Anthropic Claude models
  - Wait for approval (usually instant for Claude 3.5)
```bash
# Navigate to the project directory
cd bedrock-claude-api

# Copy example variables
cp terraform.tfvars.example terraform.tfvars

# Edit variables as needed
vim terraform.tfvars
```

```bash
# Initialize Terraform
terraform init

# Review planned changes
terraform plan

# Deploy
terraform apply
```

```bash
# Get API endpoint
terraform output api_endpoint

# Get API key (if enabled)
terraform output -raw api_key
```

```bash
# Using curl
curl -X POST "$(terraform output -raw api_endpoint)" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $(terraform output -raw api_key)" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello! Explain AWS Bedrock briefly."}
    ],
    "max_tokens": 200
  }'
```

Request format:

```json
{
  "messages": [
    {
      "role": "user",
      "content": "Your message here"
    }
  ],
  "model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
  "max_tokens": 4096,
  "temperature": 1.0,
  "system": "Optional system prompt"
}
```

Response format:

```json
{
  "content": [
    {
      "type": "text",
      "text": "Response from Claude"
    }
  ],
  "model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 15,
    "output_tokens": 50,
    "total_tokens": 65
  }
}
```

Multi-turn conversation:

```json
{
  "messages": [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What is its population?"}
  ]
}
```

See the examples/ directory for:
- test_api.sh - Bash script with multiple test scenarios
- python_client.py - Python client library with examples
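
For callers who prefer not to depend on a client library, the same request can be sketched with the Python standard library alone. This is an illustration, not one of the shipped examples: the function names `build_invoke_request`, `first_text`, and `invoke` are hypothetical, and the sketch assumes API key authentication is enabled and that requests and responses have the shapes documented above.

```python
import json
import urllib.request


def build_invoke_request(api_endpoint: str, api_key: str, prompt: str,
                         max_tokens: int = 200) -> urllib.request.Request:
    """Build a POST request carrying a single user message."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        api_endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def first_text(response_body: dict) -> str:
    """Return the first text block of a response, or an empty string."""
    for block in response_body.get("content", []):
        if block.get("type") == "text":
            return block["text"]
    return ""


def invoke(api_endpoint: str, api_key: str, prompt: str,
           max_tokens: int = 200, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_invoke_request(api_endpoint, api_key, prompt, max_tokens)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `first_text(invoke(endpoint, key, "Hello! Explain AWS Bedrock briefly."))`, where `endpoint` and `key` come from `terraform output`.
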
```python
from bedrock_client import BedrockClaudeClient

client = BedrockClaudeClient(
    api_endpoint="https://xxxxx.execute-api.us-east-1.amazonaws.com/prod/invoke",
    api_key="your-api-key"
)

response = client.simple_chat(
    "Explain AWS Lambda in one sentence.",
    max_tokens=100
)
print(response)
```

Claude 3.5 Sonnet v2:
- Input: $3.00 / 1M tokens
- Output: $15.00 / 1M tokens
Claude 3.5 Haiku:
- Input: $1.00 / 1M tokens
- Output: $5.00 / 1M tokens
- Use Haiku for simple tasks - 3x cheaper than Sonnet
- Set appropriate max_tokens - Avoid paying for unused tokens
- Implement caching - Cache similar requests at application level
- Monitor usage - Use CloudWatch metrics and Cost Explorer
- Set quotas - Configure usage plans to prevent runaway costs
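
As one possible shape for the caching tip, the sketch below keeps responses in memory and reuses them for byte-identical requests within a time window. It only matches exact duplicates, not merely similar requests, and the class name `ResponseCache` and its parameters are hypothetical. A production setup would more likely use a shared store, which is outside the scope of this sketch.

```python
import hashlib
import json
import time


class ResponseCache:
    """In-memory cache keyed by a hash of the canonical request body."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (stored_at, response)

    @staticmethod
    def key_for(request_body: dict) -> str:
        # Sorting keys makes the hash independent of dict ordering.
        canonical = json.dumps(request_body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_call(self, request_body: dict, call):
        """Return a fresh cached response, or call the API and store the result."""
        key = self.key_for(request_body)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        result = call(request_body)
        self._entries[key] = (now, result)
        return result
```

Here `call` is whatever function actually sends the request. Note that cached replies are repeated verbatim, which may be undesirable when a non-zero `temperature` is used to get varied output.
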
For 1,000 requests/day with Claude 3.5 Sonnet:
- Average 500 input tokens, 300 output tokens per request
- Input cost: (500 × 1,000 × 30) ÷ 1,000,000 × $3 = $45/month
- Output cost: (300 × 1,000 × 30) ÷ 1,000,000 × $15 = $135/month
- Total: ~$180/month
Switching to Haiku reduces this to ~$60/month (about 67% savings): $15 for input plus $45 for output at the Haiku prices listed above.
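
The arithmetic in this estimate can be reproduced with a small helper, which makes it easier to plug in your own traffic figures. The function name `monthly_cost` and the 30-day month are assumptions for illustration, and prices are passed in as arguments so the current AWS price list can be used.

```python
def monthly_cost(requests_per_day: int, input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float,
                 days: int = 30) -> tuple[float, float, float]:
    """Return (input_cost, output_cost, total) in dollars for one month."""
    total_requests = requests_per_day * days
    input_cost = input_tokens * total_requests / 1_000_000 * input_price_per_m
    output_cost = output_tokens * total_requests / 1_000_000 * output_price_per_m
    return input_cost, output_cost, input_cost + output_cost


# The Claude 3.5 Sonnet scenario: 1,000 requests/day, 500 input and 300 output tokens.
input_cost, output_cost, total = monthly_cost(1_000, 500, 300, 3.00, 15.00)
print(f"input ${input_cost:.2f}, output ${output_cost:.2f}, total ${total:.2f}")
```

Calling the same function with another model's per-million-token prices gives a like-for-like comparison.
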
| Variable | Description | Default |
|---|---|---|
| `aws_region` | AWS region | `us-east-1` |
| `project_name` | Resource naming prefix | `bedrock-claude-api` |
| `default_model_id` | Default Claude model | Claude 3.5 Sonnet v2 |
| `max_tokens` | Max response tokens | `4096` |
| `enable_api_key` | Enable API key auth | `true` |
| `quota_limit` | Daily request limit | `10000` |
| `throttle_rate_limit` | Requests per second | `50` |
| `log_retention_days` | CloudWatch retention | `7` |
- Enable API Key - Always use API key authentication in production
- Rotate Keys - Regularly rotate API keys
- Use VPC Endpoints - For internal applications, use VPC endpoints
- Enable CloudTrail - Log all API activity
- Set Rate Limits - Prevent abuse with usage plans
- Monitor Costs - Set up billing alerts
- Least Privilege IAM - Lambda has minimal required permissions
- Lambda invocations, duration, errors
- API Gateway request count, latency, 4xx/5xx errors
- Bedrock invocation metrics
- Request/response logging in `/aws/lambda/bedrock-claude-api`
- API Gateway access logs in `/aws/apigateway/bedrock-claude-api`
```bash
# View Bedrock costs
aws ce get-cost-and-usage \
  --time-period Start=2024-12-01,End=2024-12-31 \
  --granularity DAILY \
  --metrics BlendedCost \
  --filter file://bedrock-filter.json
```

Solution: Ensure you've requested model access in Bedrock console

```bash
aws bedrock list-foundation-models --region us-east-1 | grep claude
```

Solutions:
- Use Claude 3.5 Haiku for faster responses
- Reduce `max_tokens`
- Consider implementing caching
- Use streaming responses (requires code modification)
Solutions:
- Implement exponential backoff in client
- Increase `throttle_rate_limit` in Terraform
- Request a limit increase from AWS Support
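
The exponential backoff suggested above can be sketched as a small retry wrapper. API Gateway answers throttled requests with HTTP 429, and the sketch assumes the caller's send function signals that case by raising the hypothetical `ThrottledError`. The name `call_with_backoff` and the default delays are likewise illustrative, not part of this project.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error raised by the send function on an HTTP 429 response."""


def call_with_backoff(send, max_attempts: int = 5, base_delay: float = 0.5,
                      max_delay: float = 8.0, sleep=time.sleep, rng=random.random):
    """Call send(), retrying throttled attempts with jittered exponential delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return send()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of retries, surface the error to the caller
            # Full jitter: wait a random fraction of the capped exponential delay.
            delay = min(max_delay, base_delay * 2 ** attempt) * rng()
            sleep(delay)
```

The random jitter spreads retries from many clients over time so they do not all hit the API again at the same moment. The `sleep` and `rng` parameters exist only to make the wrapper easy to test.
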
Solutions:
- Check CloudWatch logs for unusual usage patterns
- Review `max_tokens` settings
- Switch to Haiku for appropriate workloads
- Implement request caching
Modify Lambda to use `invoke_model_with_response_stream`:

```python
response = bedrock_runtime.invoke_model_with_response_stream(
    modelId=model_id,
    body=json.dumps(bedrock_request)
)
```

CORS headers are already included in the Lambda response headers. Customize as needed:

```python
'Access-Control-Allow-Origin': '*'  # Change to specific domain
```

For a custom domain, add Route53 and an ACM certificate:

```hcl
resource "aws_api_gateway_domain_name" "api" {
  domain_name     = "api.yourdomain.com"
  certificate_arn = aws_acm_certificate.cert.arn
}
```

To destroy all resources:

```bash
terraform destroy
```

Warning: This will delete all resources including logs. Export any important logs first.
Contributions welcome! Please submit PRs or issues.
MIT License - See LICENSE file
For issues:
- Check CloudWatch logs
- Review IAM permissions
- Verify Bedrock model access
- Check AWS Service Health Dashboard
- Open GitHub issue with logs (redact sensitive info)