Self-hosted API gateway for Claude Code on Amazon Bedrock. Team API keys, per-user budgets, OIDC SSO, rate limiting, and an admin portal.
CCAG exposes operational metrics via two channels:

- Prometheus pull: `GET /metrics` (requires admin auth)
- OTLP push: gRPC export, enabled by setting `OTEL_EXPORTER_OTLP_ENDPOINT`

Both channels observe the same instruments. Enabling one does not disable the other.

```http
GET /metrics
Authorization: Bearer <admin-token>
```

Returns metrics in Prometheus text exposition format (text/plain; version=0.0.4).
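
As a rough illustration of reading this endpoint by hand, the Python sketch below builds the authenticated request and parses a few lines of exposition text. The base URL, the token value, the sample payload, and the helper names `build_metrics_request` and `parse_samples` are assumptions made for illustration. The parser handles only simple `name{labels} value` lines without timestamps, not the full exposition format.

```python
import urllib.request


def build_metrics_request(base_url: str, admin_token: str) -> urllib.request.Request:
    """Build a GET /metrics request carrying the admin bearer token."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/metrics",
        headers={"Authorization": f"Bearer {admin_token}"},
        method="GET",
    )


def parse_samples(text: str) -> dict[str, float]:
    """Map each sample line (name plus optional labels) to its value.

    Comment lines (# HELP / # TYPE) and blank lines are skipped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical payload; real output from the gateway may differ.
SAMPLE = """\
# TYPE ccag_requests_in_flight gauge
ccag_requests_in_flight 2
ccag_requests_total{model="m",streaming="true",status="200"} 41
"""

if __name__ == "__main__":
    req = build_metrics_request("https://ccag.example.com", "<admin-token>")
    print(req.full_url, req.get_header("Authorization"))
    print(parse_samples(SAMPLE))
    # To fetch for real: urllib.request.urlopen(req).read().decode()
```

For anything beyond a quick check, a full exposition-format parser is a safer choice than this line splitter.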

```yaml
scrape_configs:
  - job_name: ccag
    scrape_interval: 30s
    scheme: https
    metrics_path: /metrics
    authorization:
      credentials: "<admin-session-token-or-api-key>"
    static_configs:
      - targets: ["ccag.example.com"]
```

Set the OTEL_EXPORTER_OTLP_ENDPOINT environment variable to enable gRPC metric push:

```bash
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
```

Metrics are exported every 60 seconds. The exporter uses gRPC (Tonic). Both Prometheus and OTLP can run simultaneously.
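
Before deploying, it can help to sanity-check the endpoint value. The sketch below is a hypothetical pre-flight helper (`parse_otlp_endpoint` is not part of CCAG) that checks that the value is an `http` or `https` URL with a host. Falling back to port 4317, the conventional OTLP gRPC port, when none is given is an assumption and may not match how the gateway resolves it.

```python
from urllib.parse import urlsplit

DEFAULT_OTLP_GRPC_PORT = 4317  # conventional OTLP/gRPC port (assumed fallback)


def parse_otlp_endpoint(value: str) -> tuple[str, str, int]:
    """Split an endpoint URL into (scheme, host, port), or raise ValueError."""
    parts = urlsplit(value)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"endpoint must start with http:// or https://: {value!r}")
    if not parts.hostname:
        raise ValueError(f"endpoint has no host: {value!r}")
    port = parts.port if parts.port is not None else DEFAULT_OTLP_GRPC_PORT
    return parts.scheme, parts.hostname, port


if __name__ == "__main__":
    print(parse_otlp_endpoint("http://otel-collector:4317"))
```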
All metrics use the ccag prefix (dots in instrument names become underscores in Prometheus).
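
The naming rule can be sketched in a few lines of Python. The helper names below are illustrative only. The `_bucket`, `_sum`, and `_count` suffixes are the standard series that Prometheus histograms expose. Exporters can append further suffixes, so treat the actual `/metrics` output as authoritative.

```python
def prometheus_name(instrument: str) -> str:
    """Convert a dotted instrument name to its Prometheus form."""
    return instrument.replace(".", "_")


def histogram_series(instrument: str) -> list[str]:
    """List the series names a Prometheus histogram exposes for an instrument."""
    base = prometheus_name(instrument)
    return [f"{base}_bucket", f"{base}_sum", f"{base}_count"]


if __name__ == "__main__":
    for name in ("ccag.requests.total", "ccag.bedrock.throttles.total"):
        print(name, "->", prometheus_name(name))
    print(histogram_series("ccag.request.duration_ms"))
```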

| Instrument | Type | Labels | Description |
|---|---|---|---|
| `ccag.requests.total` | Counter | `model`, `streaming`, `status` | Total proxy requests |
| `ccag.request.duration_ms` | Histogram | `model`, `streaming`, `status` | Request duration in milliseconds |
| `ccag.requests.in_flight` | UpDownCounter | | Currently in-flight requests |

| Instrument | Type | Labels | Description |
|---|---|---|---|
| `ccag.tokens.input` | Counter | `model` | Total input tokens processed |
| `ccag.tokens.output` | Counter | `model` | Total output tokens generated |
| `ccag.tokens.cache_read` | Counter | `model` | Cache read input tokens |
| `ccag.tokens.cache_write` | Counter | `model` | Cache write (creation) tokens |

| Instrument | Type | Labels | Description |
|---|---|---|---|
| `ccag.tool_calls.total` | Counter | `tool`, `type` | Tool calls observed (`type`: `builtin` or `mcp`) |
| `ccag.web_searches.total` | Counter | | Web searches executed via interception |

| Instrument | Type | Labels | Description |
|---|---|---|---|
| `ccag.errors.total` | Counter | `error_type`, `endpoint_id` | Bedrock/upstream errors |
| `ccag.bedrock.throttles.total` | Counter | `model`, `endpoint_id` | Bedrock throttling events |
| `ccag.rate_limits.total` | Counter | | Gateway rate limit rejections |
| `ccag.auth_failures.total` | Counter | `reason` | Authentication failures |

| Instrument | Type | Labels | Description |
|---|---|---|---|
| `ccag.spend_flush_errors.total` | Counter | | Spend tracker flush failures |
| `ccag.spend_records_quarantined.total` | Counter | | Spend records dropped after individual insert failed with a data-rejection error (e.g. SQLSTATE 22P05). Non-zero indicates real spend data was discarded; alarm on this. |

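
One way to act on the quarantine counter is a small check over scraped exposition text. The sketch below assumes the series appears as `ccag_spend_records_quarantined_total` (the instrument name with dots replaced by underscores). The function names and the sample payload are hypothetical.

```python
QUARANTINE_METRIC = "ccag_spend_records_quarantined_total"  # assumed series name


def quarantined_count(exposition_text: str) -> float:
    """Sum all samples of the quarantine counter found in the scraped text."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and # HELP / # TYPE comments
        series, _, value = line.rpartition(" ")
        name = series.split("{", 1)[0]  # drop any label set
        if name == QUARANTINE_METRIC:
            total += float(value)
    return total


def should_alarm(exposition_text: str) -> bool:
    """Any non-zero value means spend records were discarded."""
    return quarantined_count(exposition_text) > 0


if __name__ == "__main__":
    sample = "ccag_spend_records_quarantined_total 3\n"  # hypothetical scrape
    print(should_alarm(sample))
```

In a Prometheus deployment the same condition would more commonly live in an alerting rule on this series than in a script.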
Set up a basic dashboard by creating a new dashboard and adding panels for these queries:

- `rate(ccag_requests_total[5m])`
- `rate(ccag_errors_total[5m])`
- `rate(ccag_bedrock_throttles_total[5m])` (group by `endpoint_id`)
- `ccag_requests_in_flight`
- `histogram_quantile(0.99, rate(ccag_request_duration_ms_bucket[5m]))`
- `rate(ccag_tokens_input[5m]) + rate(ccag_tokens_output[5m])`
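
To try these expressions outside a dashboard, they can be sent to the Prometheus HTTP API's instant query endpoint (`/api/v1/query`). The sketch below only builds the URLs. The Prometheus address `http://prometheus:9090` and the helper name `instant_query_url` are assumptions, and grouping by `endpoint_id` is left to the panel.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Panel expressions copied from the list above.
PANEL_QUERIES = [
    "rate(ccag_requests_total[5m])",
    "rate(ccag_errors_total[5m])",
    "rate(ccag_bedrock_throttles_total[5m])",
    "ccag_requests_in_flight",
    "histogram_quantile(0.99, rate(ccag_request_duration_ms_bucket[5m]))",
    "rate(ccag_tokens_input[5m]) + rate(ccag_tokens_output[5m])",
]


def instant_query_url(prometheus_base: str, expr: str) -> str:
    """Build a URL-encoded instant query URL for a PromQL expression."""
    return prometheus_base.rstrip("/") + "/api/v1/query?" + urlencode({"query": expr})


if __name__ == "__main__":
    for expr in PANEL_QUERIES:
        url = instant_query_url("http://prometheus:9090", expr)
        # Decoding the query string should give back the original expression.
        assert parse_qs(urlsplit(url).query)["query"] == [expr]
        print(url)
```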