Self-hosted API gateway for Claude Code on Amazon Bedrock
Production deployment guide for CCAG, an API gateway that translates the Anthropic Messages API to Amazon Bedrock.
Prerequisites:
- AWS credentials configured (aws sts get-caller-identity should succeed)
- AWS CDK installed (npm install -g aws-cdk), or use npx cdk from the infra/ directory

         Internet
             │
      ┌──────▼──────┐
      │     ALB     │
      │ (HTTPS/443) │
      └──────┬──────┘
             │
    ┌────────▼────────┐
    │   ECS Fargate   │
    │ ARM64 (Graviton)│───── Amazon Bedrock
    │    x N tasks    │      (Claude models)
    └────────┬────────┘
             │
    ┌────────▼─────────┐
    │  RDS PostgreSQL  │
    │  (db.t4g.small)  │
    └──────────────────┘
The CDK stack provisions the following resources:
| Category | Resources |
|---|---|
| Networking | VPC (2 AZs), public subnets, private subnets, NAT gateway (1), internet gateway |
| Compute | ECS cluster (Container Insights enabled), Fargate service, task definition (ARM64, 0.5 vCPU, 1 GB) |
| Load Balancing | Application Load Balancer (public), target group, HTTPS listener (with cert) or HTTP listener |
| Database | RDS PostgreSQL 16 (db.t4g.small, 20 GB gp2, auto-scaling to 100 GB, encrypted, 7-day backups, Performance Insights) |
| DNS/TLS | Route53 A record (if domain configured), ACM certificate (auto-created via DNS validation, or bring your own) |
| Secrets | Secrets Manager: DB credentials (auto-generated), API key (auto-generated), Admin key (auto-generated) |
| Autoscaling | Target tracking on CPU (70%) and memory (80%), min = desiredCount, max = desiredCount * 5 |
| Monitoring | CloudWatch log group (1 month retention), 8 CloudWatch alarms, SNS alarm topic, optional webhook forwarding via EventBridge |
| IAM | Task role with Bedrock invoke + Service Quotas read, execution role with ECR pull |
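To make the Autoscaling row concrete, here is a minimal TypeScript sketch that derives the task-count bounds from desired_count. The helper name autoscalingBounds and its return shape are assumptions made for illustration and are not taken from the stack; only the 70% and 80% targets and the 5x multiplier come from the table above.

```typescript
// Hypothetical helper mirroring the autoscaling rule in the resource table.
// Only the multiplier (5) and the CPU/memory targets come from the guide.
interface AutoscalingBounds {
  minTasks: number;
  maxTasks: number;
  cpuTargetPercent: number;
  memoryTargetPercent: number;
}

function autoscalingBounds(desiredCount: number): AutoscalingBounds {
  if (desiredCount < 1 || Math.floor(desiredCount) !== desiredCount) {
    throw new Error("desired_count must be a positive integer");
  }
  return {
    minTasks: desiredCount,     // min = desiredCount
    maxTasks: desiredCount * 5, // max = desiredCount * 5
    cpuTargetPercent: 70,
    memoryTargetPercent: 80,
  };
}

// With desired_count = 2 the service can scale between 2 and 10 tasks.
const exampleBounds = autoscalingBounds(2);
console.log(exampleBounds.minTasks, exampleBounds.maxTasks);
```

Because the maximum is tied to desired_count, raising desired_count also raises the ceiling that autoscaling can reach.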
For a minimal deployment (1 task, db.t4g.small, 1 NAT gateway):
Total: ~$107/month (excluding Bedrock API usage and data transfer)
cd claude-code-aws-gateway
cd infra
npm install
environments.json
Create environments.json in the project root (not in infra/). This is the single source of truth for deployment configuration.
{
  "region": "us-west-2",
  "stack_name": "CCAG",
  "prod": {
    "account_id": "123456789012",
    "domain_name": "ccag.example.com",
    "hosted_zone_name": "example.com",
    "certificate_arn": null,
    "admin_users": "user@example.com",
    "desired_count": 2
  }
}
Note: ecr_repo_name is omitted because the stack pulls from GHCR by default. Only add it if you need private ECR images (see Using ECR Instead below).
See Configuration below for all available fields.
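Before deploying, it can help to sanity-check the file. The following TypeScript sketch checks only the fields this guide marks as required (region, account_id, desired_count). The function validateEnvironment and its error messages are hypothetical; this guide does not show what validation infra/app.ts itself performs.

```typescript
// Hypothetical validator for environments.json. Field names and which ones
// are required come from the guide; everything else is illustrative.
type EnvConfigFile = { [key: string]: unknown };

function validateEnvironment(config: EnvConfigFile, envName: string): string[] {
  const problems: string[] = [];
  if (typeof config["region"] !== "string") {
    problems.push("region is required (top level)");
  }
  const env = config[envName];
  if (typeof env !== "object" || env === null) {
    problems.push(`environment "${envName}" is not defined`);
    return problems;
  }
  const fields = env as { [key: string]: unknown };
  // AWS account IDs have 12 digits; a string preserves any leading zeros.
  const accountId = fields["account_id"];
  if (typeof accountId !== "string" || !/^\d{12}$/.test(accountId)) {
    problems.push(`${envName}.account_id must be a 12-digit string`);
  }
  const count = fields["desired_count"];
  if (typeof count !== "number" || count < 1 || Math.floor(count) !== count) {
    problems.push(`${envName}.desired_count must be a positive integer`);
  }
  return problems;
}

const sampleConfig = {
  region: "us-west-2",
  stack_name: "CCAG",
  prod: { account_id: "123456789012", desired_count: 2 },
};
console.log(validateEnvironment(sampleConfig, "prod").length);    // 0 problems
console.log(validateEnvironment(sampleConfig, "staging").length); // 1 problem: not defined
```

Returning a list of problems instead of throwing on the first one lets you fix every issue in a single pass.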
cd infra
npx cdk bootstrap aws://123456789012/us-west-2 -c environment=prod
Replace prod with your environment name from environments.json if different.
cd infra
npx cdk deploy -c environment=prod -c imageTag=1.0.0
Use a version tag from GitHub Releases.
CDK will show the resources it plans to create. Review and confirm.
After deployment, CDK prints several outputs:
aws secretsmanager get-secret-value --secret-id <ApiKeySecretArn> --query SecretString --output text
aws sns subscribe --topic-arn <AlarmTopicArn> --protocol email --notification-endpoint you@example.com
Log in to the admin portal at https://your-domain.com/portal (or the PortalUrl from step 6) and navigate to the Connect page. It provides a setup script that configures Claude Code automatically.
environments.json Reference
The file is read by infra/app.ts. Top-level fields apply to all environments; per-environment fields are nested under the environment name.
Top-level fields:
| Field | Required | Description |
|---|---|---|
| region | Yes | AWS region for the stack (must have Bedrock Claude access) |
| ecr_repo_name | No | ECR repository name (e.g., ccag). If omitted, pulls from GHCR (recommended for most users) |
| stack_name | No | CloudFormation stack name (default: CCAG) |
Per-environment fields (nested under staging, prod, or any name you choose):
| Field | Required | Description |
|---|---|---|
| account_id | Yes | AWS account ID |
| desired_count | Yes | Number of ECS tasks (min capacity for autoscaling) |
| domain_name | No | Full domain name (e.g., ccag.example.com). Enables HTTPS, Route53 record, and sets OIDC_AUDIENCE |
| hosted_zone_name | No | Route53 hosted zone (e.g., example.com). Required for auto-created ACM cert and DNS record |
| certificate_arn | No | Existing ACM certificate ARN. If omitted and domain_name + hosted_zone_name are set, a cert is auto-created via DNS validation |
| admin_users | No | Comma-separated OIDC subjects auto-provisioned as admin users |
| admin_password | No | Admin login password (default: admin). Change this for production. |
| rds_iam_auth | No | Use IAM authentication for RDS instead of Secrets Manager password (default: false). Requires a manual GRANT rds_iam TO proxy; after first deploy. |
| allowed_cidrs | No | JSON array of CIDR blocks allowed to reach the ALB (e.g., ["203.0.113.0/24"]). When null or omitted, ALB is open to all sources. |
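The certificate_arn, domain_name, and hosted_zone_name fields in the table above jointly decide which listener the ALB gets. The TypeScript sketch below is one way to express that decision. The names TlsFields, TlsMode, and resolveTlsMode are hypothetical, and the HTTP fallback is inferred from the "HTTPS listener (with cert) or HTTP listener" entry in the resource table, not from the stack source.

```typescript
// Hypothetical model of the TLS-related fields; not the stack's actual code.
interface TlsFields {
  domain_name?: string | null;
  hosted_zone_name?: string | null;
  certificate_arn?: string | null;
}

type TlsMode = "existing-certificate" | "auto-created-certificate" | "http-only";

function resolveTlsMode(env: TlsFields): TlsMode {
  if (env.certificate_arn) {
    return "existing-certificate"; // HTTPS on 443 with the supplied cert
  }
  if (env.domain_name && env.hosted_zone_name) {
    return "auto-created-certificate"; // ACM cert created via DNS validation
  }
  return "http-only"; // assumption: without a cert only the HTTP listener exists
}

// The prod block from the sample file: a domain and zone, but no certificate ARN.
console.log(
  resolveTlsMode({
    domain_name: "ccag.example.com",
    hosted_zone_name: "example.com",
    certificate_arn: null,
  }),
); // auto-created-certificate
```

Note that a domain_name without a hosted_zone_name falls through to the last branch in this sketch, since the zone is what makes the DNS validation possible.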
How TLS is determined:
- If certificate_arn is provided, that cert is used with HTTPS on port 443.
- If domain_name and hosted_zone_name are set (but no certificate_arn), an ACM cert is auto-created via DNS validation.
- Otherwise, the ALB uses the HTTP listener.

Pass these CDK context parameters with -c key=value on the command line:
| Parameter | Description |
|---|---|
| environment | Which environment block to use from environments.json (default: prod) |
| imageTag | ECR image tag to deploy (e.g., abcd1234) |
| imageDigest | ECR image digest (sha256:...). Preferred over imageTag to avoid no-op deploys |
| rdsIamAuth | Use IAM auth for RDS (true/false). Overrides rds_iam_auth in environments.json. |
| alarmWebhookUrl | Webhook URL for alarm notifications (Slack, etc.) via EventBridge API destination |
The alarmWebhookUrl can also be set via the ALARM_WEBHOOK_URL environment variable.
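If you script deployments, the context parameters above can be assembled programmatically. The sketch below builds a cdk deploy argument list. The -c key=value flag is the real CDK CLI syntax, while DeployOptions and buildDeployArgs are hypothetical names. Passing only the digest when both a tag and a digest are supplied is an assumption based on the "preferred over imageTag" note.

```typescript
// Hypothetical helper that assembles `cdk deploy` arguments from the context
// parameters listed in the table. Names and structure are illustrative.
interface DeployOptions {
  environment?: string;
  imageTag?: string;
  imageDigest?: string;
  rdsIamAuth?: boolean;
  alarmWebhookUrl?: string;
}

function buildDeployArgs(opts: DeployOptions): string[] {
  // environment defaults to prod, as documented.
  const args = ["cdk", "deploy", "-c", `environment=${opts.environment ?? "prod"}`];
  // Assumption: when both are given, pass only the digest.
  if (opts.imageDigest) {
    args.push("-c", `imageDigest=${opts.imageDigest}`);
  } else if (opts.imageTag) {
    args.push("-c", `imageTag=${opts.imageTag}`);
  }
  if (opts.rdsIamAuth !== undefined) {
    args.push("-c", `rdsIamAuth=${String(opts.rdsIamAuth)}`);
  }
  if (opts.alarmWebhookUrl) {
    args.push("-c", `alarmWebhookUrl=${opts.alarmWebhookUrl}`);
  }
  return args;
}

console.log(buildDeployArgs({ imageTag: "1.0.0" }).join(" "));
// cdk deploy -c environment=prod -c imageTag=1.0.0
```

Returning an array instead of a single string makes it straightforward to hand the arguments to a process runner without shell quoting issues.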
These are set automatically by the CDK stack on the Fargate tasks:
| Variable | Source | Description |
|---|---|---|
| PROXY_HOST | Hardcoded 0.0.0.0 | Listen address |
| PROXY_PORT | Hardcoded 8080 | Listen port |
| RUST_LOG | Hardcoded info | Log level |
| LOG_FORMAT | Hardcoded json | Structured logging for CloudWatch |
| DATABASE_HOST | From RDS endpoint | Postgres host |
| DATABASE_PORT | From RDS endpoint | Postgres port |
| DATABASE_NAME | Hardcoded proxy | Postgres database name |
| DATABASE_USER | Hardcoded proxy | Postgres username |
| DB_PASSWORD | From Secrets Manager (default) | Postgres password (injected as ECS secret). Only set when using Secrets Manager auth (the default). |
| RDS_IAM_AUTH | CDK context flag | Set to true when deployed with -c rdsIamAuth=true. Uses IAM auth tokens instead of password. |
| OIDC_AUDIENCE | From domain_name | OIDC JWT audience claim (if domain configured) |
| ADMIN_USERS | From admin_users | OIDC subjects with admin access |
| ADMIN_PASSWORD | From admin_password | Admin login password (if set in environments.json) |
To add additional environment variables (e.g., OIDC_ISSUER, OTEL_EXPORTER_OTLP_ENDPOINT), modify the environment block in infra/stack.ts where the container is defined.
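As a rough illustration of what such an environment block amounts to, the TypeScript sketch below builds the variable map from the fixed values in the table and merges extra variables on top. The function gatewayEnvironment, its input shape, and the example endpoint value are hypothetical; the real definition lives in infra/stack.ts and may be structured differently.

```typescript
// Hypothetical sketch of the container environment described in the table.
// Variable names and fixed values come from the guide; the function and its
// parameters are illustrative.
interface GatewayEnvInput {
  dbHost: string;
  dbPort: number;
  adminUsers?: string;
  rdsIamAuth?: boolean;
  extra?: { [name: string]: string };
}

function gatewayEnvironment(input: GatewayEnvInput): { [name: string]: string } {
  const env: { [name: string]: string } = {
    PROXY_HOST: "0.0.0.0",
    PROXY_PORT: "8080",
    RUST_LOG: "info",
    LOG_FORMAT: "json",
    DATABASE_HOST: input.dbHost,
    DATABASE_PORT: String(input.dbPort),
    DATABASE_NAME: "proxy",
    DATABASE_USER: "proxy",
  };
  if (input.rdsIamAuth) {
    // With IAM auth, DB_PASSWORD is not injected as an ECS secret.
    env["RDS_IAM_AUTH"] = "true";
  }
  if (input.adminUsers) {
    env["ADMIN_USERS"] = input.adminUsers;
  }
  // Extra variables are merged last, so they can override the defaults.
  const extra: { [name: string]: string } = input.extra ?? {};
  return { ...env, ...extra };
}

const taskEnv = gatewayEnvironment({
  dbHost: "db.example.internal", // placeholder host
  dbPort: 5432,
  extra: { OTEL_EXPORTER_OTLP_ENDPOINT: "http://localhost:4317" }, // placeholder endpoint
});
console.log(taskEnv["OTEL_EXPORTER_OTLP_ENDPOINT"]); // http://localhost:4317
```

DB_PASSWORD is deliberately absent from the map: per the table it is injected as an ECS secret and is not set as a plain environment variable.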
You can define multiple environments in environments.json:
{
  "region": "us-west-2",
  "ecr_repo_name": "ccag",
  "staging": {
    "account_id": "111111111111",
    "desired_count": 1,
    "domain_name": "ccag-staging.example.com",
    "hosted_zone_name": "example.com"
  },
  "prod": {
    "account_id": "222222222222",
    "desired_count": 2,
    "domain_name": "ccag.example.com",
    "hosted_zone_name": "example.com"
  }
}
Deploy to a specific environment:
npx cdk deploy -c environment=staging -c imageTag=1.0.0
npx cdk deploy -c environment=prod -c imageTag=1.0.0
Check for new releases at GitHub Releases. Review the release notes for breaking changes.
cd infra
npx cdk deploy -c environment=prod -c imageTag=1.1.0
CloudFormation computes the delta and only updates changed resources. ECS performs a rolling deployment: new tasks start, pass health checks, then old tasks drain.
Database migrations run automatically on application startup. No manual migration step is needed.
npx cdk deploy -c environment=prod -c imageTag=1.0.0
The stack creates these alarms, all publishing to the SNS alarm topic:
| Alarm | Condition | Description |
|---|---|---|
| ALB 5xx | > 5 in 5 min | ALB-generated errors (e.g., 504 timeouts) |
| Target 5xx | > 10 in 5 min | Application-generated 5xx errors |
| Unhealthy Targets | >= 1 for 2 min | ECS tasks failing health checks |
| High Latency | p99 > 120s for 15 min | Sustained extreme response times |
| DB CPU | > 80% for 15 min | RDS CPU utilization |
| DB Storage | < 2 GB | RDS free storage space |
| DB Connections | > 80 for 10 min | Approaching RDS connection limit (~120 for t4g.small) |
| App Errors | > 5 in 5 min | Log-based: ERROR or panic in application logs |
Subscribe to alarms via email:
aws sns subscribe --topic-arn <AlarmTopicArn> --protocol email --notification-endpoint you@example.com
Webhook notifications (Slack, PagerDuty, etc.): Pass alarmWebhookUrl during deploy. CloudWatch alarm state changes are forwarded via EventBridge to your webhook as JSON:
{
  "alarmName": "CCAG-Alb5xxAlarm",
  "state": "ALARM",
  "reason": "Threshold crossed: 8 > 5",
  "description": "ALB is generating 5xx errors",
  "timestamp": "2025-01-15T10:30:00Z"
}
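If you run your own webhook receiver, the payload above maps onto a small type. The TypeScript sketch below parses a request body and renders a one-line message. AlarmPayload, parseAlarmPayload, and formatAlarm are hypothetical receiver-side names and are not part of the gateway. Only the five field names come from the example payload.

```typescript
// Shape of the webhook payload shown above. The parser and formatter are a
// hypothetical receiver-side sketch.
interface AlarmPayload {
  alarmName: string;
  state: string;
  reason: string;
  description: string;
  timestamp: string;
}

function parseAlarmPayload(body: string): AlarmPayload {
  const data: unknown = JSON.parse(body);
  if (typeof data !== "object" || data === null) {
    throw new Error("payload is not a JSON object");
  }
  const record = data as { [key: string]: unknown };
  const fields = ["alarmName", "state", "reason", "description", "timestamp"];
  for (const field of fields) {
    if (typeof record[field] !== "string") {
      throw new Error(`missing or non-string field: ${field}`);
    }
  }
  return data as AlarmPayload;
}

function formatAlarm(p: AlarmPayload): string {
  return `[${p.state}] ${p.alarmName}: ${p.description} (${p.reason}) at ${p.timestamp}`;
}

const samplePayload = parseAlarmPayload(
  '{"alarmName":"CCAG-Alb5xxAlarm","state":"ALARM","reason":"Threshold crossed: 8 > 5",' +
    '"description":"ALB is generating 5xx errors","timestamp":"2025-01-15T10:30:00Z"}',
);
console.log(formatAlarm(samplePayload));
// [ALARM] CCAG-Alb5xxAlarm: ALB is generating 5xx errors (Threshold crossed: 8 > 5) at 2025-01-15T10:30:00Z
```

Validating the fields before use keeps a malformed or unrelated request from producing a misleading notification.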
aws logs tail /aws/ecs/ccag --follow
Or find the log group name in the CloudFormation stack outputs/resources.
| Endpoint | Description |
|---|---|
| GET /health | Basic liveness check (used by ALB target group, 15s interval) |
The gateway exposes Prometheus metrics at GET /metrics. To scrape these, configure a Prometheus instance or use Amazon Managed Prometheus with an ECS sidecar.
To export metrics via OTLP, set OTEL_EXPORTER_OTLP_ENDPOINT in the container environment.
aws rds modify-db-instance \
  --db-instance-identifier <instance-id> \
  --no-deletion-protection
cd infra
npx cdk destroy -c environment=prod
Note: RDS has removalPolicy: SNAPSHOT, so a final snapshot is created before deletion.
aws ecr delete-repository --repository-name ccag --force
Delete leftover log groups (not removed by cdk destroy due to RETAIN policy):
# List CCAG-related log groups
aws logs describe-log-groups --query "logGroups[?contains(logGroupName, 'CCAG')].logGroupName" --output table
# Delete each one
aws logs delete-log-group --log-group-name <log-group-name>
Log groups that may remain: application log group (RETAIN policy), Container Insights performance logs, and RDS PostgreSQL export logs.
# List snapshots
aws rds describe-db-snapshots --query "DBSnapshots[?contains(DBSnapshotIdentifier, 'ccag') || contains(DBSnapshotIdentifier, 'CCAG')].{ID:DBSnapshotIdentifier,Size:AllocatedStorage}" --output table
# Delete each snapshot
aws rds delete-db-snapshot --db-snapshot-identifier <snapshot-id>
aws cloudformation delete-stack --stack-name CDKToolkit
Symptom: Gateway returns 403 or “access denied” errors from Bedrock.
Fix: Ensure Bedrock model access is enabled in your AWS account for the target region. Go to the Bedrock console > Model access > Enable the Claude models you need. Note that newer models (Claude 4.5, 4.6) require inference profiles, which the gateway handles automatically.
Symptom: Model not found or empty responses.
Fix: The gateway auto-detects model routing prefixes from the AWS region. Ensure your region in environments.json matches a region where Claude models are available. Supported regions include us-west-2, us-east-1, eu-west-1, ap-southeast-2, and others.
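To illustrate what region-based routing means in practice, here is a TypeScript sketch that maps a region to a geographic prefix. This is not the gateway's implementation, which is not shown in this guide. The function inferenceProfilePrefix is hypothetical, and the us, eu, and apac values are assumed from the naming used by Bedrock cross-region inference profiles. Other geographies and newer profile types are deliberately left out.

```typescript
// Hypothetical illustration of region-based routing. The prefixes follow the
// naming of Bedrock cross-region inference profiles; the mapping is a sketch
// and covers only three geographies.
function inferenceProfilePrefix(region: string): string | undefined {
  if (/^us-(east|west)-\d$/.test(region)) return "us";
  if (/^eu-/.test(region)) return "eu";
  if (/^ap-/.test(region)) return "apac";
  return undefined; // unknown geography: the caller must handle this case
}

console.log(inferenceProfilePrefix("us-west-2"));      // us
console.log(inferenceProfilePrefix("eu-west-1"));      // eu
console.log(inferenceProfilePrefix("ap-southeast-2")); // apac
```

A region that yields no prefix in a mapping like this is the kind of mismatch that would surface as a "model not found" error, which is why the region in environments.json matters.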
Symptom: Desired count > 0 but no running tasks.
Check:
# View stopped task reasons
aws ecs describe-tasks --cluster <cluster-name> \
  --tasks $(aws ecs list-tasks --cluster <cluster-name> --desired-status STOPPED --query 'taskArns[0]' --output text) \
  --query 'tasks[0].{reason:stoppedReason,status:lastStatus}'
# View task logs
aws logs tail <log-group-name> --since 30m
Common causes:
The stack enables ECS Exec on Fargate tasks. The helper script fetches DB credentials locally (IAM auth token or Secrets Manager password) and runs psql inside the container:
.claude/scripts/bastion.sh connect --env prod # Interactive psql
.claude/scripts/bastion.sh query --env prod 'SELECT count(*) FROM api_keys'
.claude/scripts/bastion.sh shell --env prod # Interactive shell
Symptom: Intermittent 504 Gateway Timeout on long-running requests.
Context: The ALB idle timeout is set to 900 seconds (15 minutes) to accommodate streaming responses from thinking models, which can have long pauses before the first chunk. If you still see 504s, the Bedrock request itself may be timing out.
If a deployment fails, CloudFormation automatically rolls back. To investigate:
aws cloudformation describe-stack-events --stack-name CCAG \
  --query 'StackEvents[?ResourceStatus==`CREATE_FAILED` || ResourceStatus==`UPDATE_FAILED`]'
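If you save the unfiltered event list as JSON, the same filter can be applied in code. The TypeScript sketch below mirrors the query above and picks the earliest failure, which is often the root cause because later failures tend to be cancellations. StackEventSummary, failedEvents, and earliestFailure are hypothetical names. The three field names match the DescribeStackEvents output, and the sketch assumes the events are in the reverse chronological order that the API returns.

```typescript
// Mirrors the JMESPath filter on already-fetched events. The helper names
// are illustrative; the field names match the DescribeStackEvents output.
interface StackEventSummary {
  LogicalResourceId: string;
  ResourceStatus: string;
  ResourceStatusReason?: string;
}

function failedEvents(events: StackEventSummary[]): StackEventSummary[] {
  return events.filter(
    (e) => e.ResourceStatus === "CREATE_FAILED" || e.ResourceStatus === "UPDATE_FAILED",
  );
}

// Assumes newest-first ordering, so the last failed entry is the earliest one.
function earliestFailure(events: StackEventSummary[]): StackEventSummary | undefined {
  const failed = failedEvents(events);
  return failed.length > 0 ? failed[failed.length - 1] : undefined;
}

// Made-up events for illustration only.
const sampleEvents: StackEventSummary[] = [
  { LogicalResourceId: "Service", ResourceStatus: "CREATE_FAILED", ResourceStatusReason: "Resource creation cancelled" },
  { LogicalResourceId: "Database", ResourceStatus: "CREATE_FAILED", ResourceStatusReason: "Example root cause" },
  { LogicalResourceId: "Vpc", ResourceStatus: "CREATE_COMPLETE" },
];
console.log(earliestFailure(sampleEvents)?.LogicalResourceId); // Database
```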