Self-hosted API gateway for Claude Code on Amazon Bedrock
CCAG exposes operational metrics via two channels:

- Prometheus scrape: GET /metrics (requires admin auth)
- OTLP push: gRPC export, enabled by setting OTEL_EXPORTER_OTLP_ENDPOINT

Both channels observe the same instruments. Enabling one does not disable the other.
GET /metrics
Authorization: Bearer <admin-token>
Returns metrics in Prometheus text exposition format (text/plain; version=0.0.4).
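For a quick manual check without a Prometheus server, the request above can be reproduced from a script. The following is a minimal sketch using only the Python standard library. The helper names `build_metrics_request` and `parse_samples` are hypothetical and not part of CCAG, and the parser only handles simple `name{labels} value` sample lines.

```python
import urllib.request


def build_metrics_request(base_url: str, admin_token: str) -> urllib.request.Request:
    """Build the authenticated GET /metrics request (hypothetical helper)."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/metrics",
        headers={"Authorization": f"Bearer {admin_token}"},
        method="GET",
    )


def parse_samples(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition into {series: value}.

    Simplified: skips comment lines (# HELP / # TYPE) and ignores
    optional timestamps after the value.
    """
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The series name ends at the closing brace of the label set,
        # or at the first space when there are no labels.
        end = line.rindex("}") + 1 if "}" in line else line.index(" ")
        series = line[:end]
        value = line[end:].split()[0]
        samples[series] = float(value)
    return samples


# Usage (requires network access and a valid admin token):
# req = build_metrics_request("https://ccag.example.com", "<admin-token>")
# with urllib.request.urlopen(req) as resp:
#     print(parse_samples(resp.read().decode("utf-8")))
```

For anything beyond a smoke test, a maintained parser for the exposition format is likely a better choice than this hand-rolled one.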
Example Prometheus scrape configuration:

    scrape_configs:
      - job_name: ccag
        scrape_interval: 30s
        scheme: https
        metrics_path: /metrics
        authorization:
          credentials: "<admin-session-token-or-api-key>"
        static_configs:
          - targets: ["ccag.example.com"]
Set the OTEL_EXPORTER_OTLP_ENDPOINT environment variable to enable gRPC metric push:
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
Metrics are exported every 60 seconds. The exporter uses gRPC (Tonic). Both Prometheus and OTLP can run simultaneously.
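A deployment script may want to confirm that the variable is set to a usable value before starting the gateway. The following preflight check is a sketch only: `otlp_endpoint` is a hypothetical helper, not something CCAG ships, and the accepted schemes are an assumption based on the example value above.

```python
import os
from urllib.parse import urlparse


def otlp_endpoint(env=os.environ):
    """Return (host, port) of the OTLP collector, or None if push is not enabled.

    Hypothetical preflight helper; CCAG itself reads the variable directly.
    """
    raw = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "").strip()
    if not raw:
        return None  # variable unset: only the /metrics scrape channel is in use
    parsed = urlparse(raw)
    # Assumption: the endpoint is written as an http(s) URL, as in the example.
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"unexpected OTLP endpoint: {raw!r}")
    return parsed.hostname, parsed.port


print(otlp_endpoint({"OTEL_EXPORTER_OTLP_ENDPOINT": "http://otel-collector:4317"}))
# prints ('otel-collector', 4317)
```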
All metrics use the ccag prefix (dots in instrument names become underscores in Prometheus).
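The naming rule can be expressed as a small mapping. This is an illustrative sketch with hypothetical helper names. The `_bucket`, `_sum` and `_count` suffixes are the standard series Prometheus exposes for a histogram, and whether the exporter adds any further suffixes is not covered here.

```python
def to_prometheus_name(instrument: str) -> str:
    """Map an instrument name to its Prometheus form (dots become underscores)."""
    return instrument.replace(".", "_")


def histogram_series(instrument: str) -> list[str]:
    """Series names a Prometheus histogram exposes for one instrument."""
    base = to_prometheus_name(instrument)
    return [f"{base}_bucket", f"{base}_sum", f"{base}_count"]


print(to_prometheus_name("ccag.requests.total"))
# prints ccag_requests_total
print(histogram_series("ccag.request.duration_ms")[0])
# prints ccag_request_duration_ms_bucket
```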
| Instrument | Type | Labels | Description |
|---|---|---|---|
| ccag.requests.total | Counter | model, streaming, status | Total proxy requests |
| ccag.request.duration_ms | Histogram | model, streaming, status | Request duration in milliseconds |
| ccag.requests.in_flight | UpDownCounter | | Currently in-flight requests |

| Instrument | Type | Labels | Description |
|---|---|---|---|
| ccag.tokens.input | Counter | model | Total input tokens processed |
| ccag.tokens.output | Counter | model | Total output tokens generated |
| ccag.tokens.cache_read | Counter | model | Cache read input tokens |
| ccag.tokens.cache_write | Counter | model | Cache write (creation) tokens |

| Instrument | Type | Labels | Description |
|---|---|---|---|
| ccag.tool_calls.total | Counter | tool, type | Tool calls observed (type: builtin or mcp) |
| ccag.web_searches.total | Counter | | Web searches executed via interception |

| Instrument | Type | Labels | Description |
|---|---|---|---|
| ccag.errors.total | Counter | error_type, endpoint_id | Bedrock/upstream errors |
| ccag.bedrock.throttles.total | Counter | model, endpoint_id | Bedrock throttling events |
| ccag.rate_limits.total | Counter | | Gateway rate limit rejections |
| ccag.auth_failures.total | Counter | reason | Authentication failures |

| Instrument | Type | Labels | Description |
|---|---|---|---|
| ccag.spend_flush_errors.total | Counter | | Spend tracker flush failures |
Build a basic dashboard by creating a new dashboard and adding panels with these queries:

- rate(ccag_requests_total[5m])
- rate(ccag_errors_total[5m])
- rate(ccag_bedrock_throttles_total[5m]) (group by endpoint_id)
- ccag_requests_in_flight
- histogram_quantile(0.99, rate(ccag_request_duration_ms_bucket[5m]))
- rate(ccag_tokens_input[5m]) + rate(ccag_tokens_output[5m])
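The p99 latency query estimates a quantile from cumulative bucket counts, so its result is an interpolation, not an exact measurement. The sketch below illustrates the idea with linear interpolation inside the matching bucket. It is a simplified model of the documented `histogram_quantile` behaviour for classic histograms, not Prometheus's actual implementation, and the bucket boundaries in the usage example are made up.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from (upper_bound, cumulative_count) pairs.

    Assumes 0 < q <= 1, a final +Inf bucket, and at least one finite bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(buckets)
    total = ordered[-1][1]  # the +Inf bucket holds the total observation count
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for i, (upper, count) in enumerate(ordered):
        if count >= rank:
            break
    if math.isinf(upper):
        # The quantile falls past the last finite bound: report that bound.
        return ordered[-2][0]
    # Lowest bucket is assumed to start at 0.
    lower, below = ordered[i - 1] if i > 0 else (0.0, 0)
    # Linear interpolation within the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


# Illustrative bucket bounds in milliseconds (not CCAG's real boundaries).
buckets = [(100, 50), (250, 90), (500, 99), (math.inf, 100)]
print(histogram_quantile(0.75, buckets))
# prints 193.75
```

Because the estimate can never be finer than the bucket layout, a p99 that sits exactly on a bucket boundary often means the real value lies somewhere inside or beyond that bucket.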