Budget Enforcement

Set daily and monthly spend limits per API key, per agent, or per tenant. When a limit is exceeded, requests are blocked before they reach the LLM provider.

How It Works

1. Request arrives at the proxy
2. Budget manager checks current spend against limits
3. If over limit: return 429 immediately (no API call, no cost)
4. If approaching limit (soft limit): add warning header, forward request
5. If under limit: forward request normally
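
The decision flow above can be sketched in a few lines of Python. This is an illustration only, not AgentLedger's implementation: the names `Decision` and `check_budget` are hypothetical, the treatment of spend exactly equal to the limit is an assumption, and the warning text for an over-limit request in "warn" mode is assumed to reuse the soft-limit format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    action: str                    # "block", "warn", or "forward"
    http_status: Optional[int]     # 429 when blocked; None when the provider's response is passed through
    warning: Optional[str] = None  # value for X-AgentLedger-Budget-Warning, if any


def check_budget(spent_usd: float, limit_usd: float, soft_limit_pct: float = 0.8,
                 action: str = "block", window: str = "daily") -> Decision:
    """Hypothetical helper mirroring the documented flow; assumes limit_usd > 0."""
    pct = round(100 * spent_usd / limit_usd)
    message = f"{window} spend at {pct}% of limit"
    # Assumption: spend equal to the limit counts as "over limit".
    if spent_usd >= limit_usd:
        if action == "block":
            return Decision("block", 429)       # rejected before any provider call
        return Decision("warn", None, message)  # "warn" mode: header only, still forwarded
    if spent_usd >= soft_limit_pct * limit_usd:
        return Decision("warn", None, message)  # soft limit reached, still forwarded
    return Decision("forward", None)
```

For example, `check_budget(41.0, 50.0)` yields a "warn" decision with the message `daily spend at 82% of limit`, while `check_budget(12.5, 10.0)` yields a block with status 429.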

Configuration

budgets:
  default:
    daily_limit_usd: 50.0
    monthly_limit_usd: 500.0
    soft_limit_pct: 0.8        # warn at 80%
    action: "block"            # "block" returns 429, "warn" adds header only
  rules:
    - api_key_pattern: "sk-proj-dev-*"
      daily_limit_usd: 5.0
      monthly_limit_usd: 50.0
      action: "block"
    - tenant_id: "alpha"
      daily_limit_usd: 100.0
      monthly_limit_usd: 1000.0
      action: "block"

Block Response

When a limit is hit:

{
  "error": {
    "type": "budget_exceeded",
    "message": "spending limit exceeded",
    "daily_spent": 12.50,
    "daily_limit": 10.00,
    "monthly_spent": 45.00,
    "monthly_limit": 500.00
  }
}

HTTP status: 429 Too Many Requests
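
A client can tell a budget block apart from other 429 responses (such as a provider rate limit) by inspecting the error body. The sketch below assumes the body has exactly the shape shown above. `parse_budget_block` is a hypothetical helper, not part of AgentLedger.

```python
import json
from typing import Optional


def parse_budget_block(status_code: int, body: str) -> Optional[dict]:
    """Return a short summary if the response is a budget block, else None."""
    if status_code != 429:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict) or error.get("type") != "budget_exceeded":
        return None  # a different kind of 429
    # Work out which window(s) were exceeded from the spent/limit pairs.
    exceeded = [
        window for window in ("daily", "monthly")
        if error.get(f"{window}_spent", 0.0) >= error.get(f"{window}_limit", float("inf"))
    ]
    return {"message": error.get("message", ""), "exceeded": exceeded}
```

Applied to the example body above, this reports that only the daily limit was exceeded. Retrying before the budget window resets would presumably hit the same block.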

Soft Limits

When soft_limit_pct is configured, AgentLedger adds a response header once spend reaches that fraction of the limit (0.8 means 80%):

X-AgentLedger-Budget-Warning: daily spend at 82% of limit

The request is still forwarded; soft limits are informational only.
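
Callers that want to react to the warning (for example, by logging or alerting) can read the header from the response. The sketch below assumes the header value always follows the format of the example above, and that a "monthly" variant exists with the same wording. Neither is confirmed here, so the parser returns None for anything it does not recognize. `parse_budget_warning` is a hypothetical name.

```python
import re
from typing import Mapping, Optional, Tuple

WARNING_HEADER = "X-AgentLedger-Budget-Warning"
# Assumed format, taken from the single documented example.
_PATTERN = re.compile(r"^(daily|monthly) spend at (\d+(?:\.\d+)?)% of limit$")


def parse_budget_warning(headers: Mapping[str, str]) -> Optional[Tuple[str, float]]:
    """Return (window, percent_used) if a recognizable warning header is present."""
    # HTTP header names are case-insensitive, so normalize before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get(WARNING_HEADER.lower())
    if value is None:
        return None
    match = _PATTERN.match(value.strip())
    if match is None:
        return None
    return match.group(1), float(match.group(2))
```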

Pre-Flight Estimation

Before forwarding a request, AgentLedger estimates its worst-case cost from max_tokens. If that estimate would exceed the remaining budget, the request is rejected immediately, so no provider spend is incurred.
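
As a rough illustration of the arithmetic, the worst case can be taken as the prompt cost plus the cost of generating all max_tokens output tokens. The formula, the function names, and the per-million-token prices below are assumptions made for the example. They are not AgentLedger's actual pricing table or code.

```python
def worst_case_cost_usd(prompt_tokens: int, max_tokens: int,
                        input_usd_per_mtok: float, output_usd_per_mtok: float) -> float:
    """Upper bound on request cost: assume the model emits every allowed output token."""
    input_cost = prompt_tokens * input_usd_per_mtok / 1_000_000
    output_cost = max_tokens * output_usd_per_mtok / 1_000_000
    return input_cost + output_cost


def preflight_allows(estimate_usd: float, spent_usd: float, limit_usd: float) -> bool:
    """Reject when the estimate would push spend past the limit."""
    remaining = limit_usd - spent_usd
    return estimate_usd <= remaining


# Hypothetical prices: $3 per million input tokens, $15 per million output tokens.
estimate = worst_case_cost_usd(prompt_tokens=1_000, max_tokens=4_000,
                               input_usd_per_mtok=3.0, output_usd_per_mtok=15.0)
```

With these made-up prices the estimate is about $0.063. Against a $5.00 daily limit, the request would pass with $4.00 already spent and be rejected with $4.95 already spent. Because the estimate is an upper bound, lowering max_tokens is one way for a caller to fit within a nearly exhausted budget.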

Per-Key Rules

Rules use glob patterns to match API keys:

| Pattern | Matches |
| --- | --- |
| `sk-proj-dev-*` | All keys starting with `sk-proj-dev-` |
| `sk-*` | All keys starting with `sk-` |
| `*` | All keys |

Rules are evaluated in order. The first matching rule wins. If no rule matches, the default applies.
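
First-match-wins resolution can be sketched with Python's standard `fnmatch` module. The rule data mirrors the configuration example. `resolve_budget` is a hypothetical function, and how a rule carrying both `api_key_pattern` and `tenant_id` should behave is not specified here, so this sketch treats either field matching as a match. Note also that `fnmatch` accepts `?` and `[seq]` wildcards, which AgentLedger's glob syntax may or may not support.

```python
from fnmatch import fnmatchcase
from typing import Optional

DEFAULT = {"daily_limit_usd": 50.0, "monthly_limit_usd": 500.0, "action": "block"}
RULES = [
    {"api_key_pattern": "sk-proj-dev-*", "daily_limit_usd": 5.0,
     "monthly_limit_usd": 50.0, "action": "block"},
    {"tenant_id": "alpha", "daily_limit_usd": 100.0,
     "monthly_limit_usd": 1000.0, "action": "block"},
]


def resolve_budget(api_key: str, tenant_id: Optional[str] = None,
                   rules=RULES, default=DEFAULT) -> dict:
    """Return the first rule that matches, or the default if none does."""
    for rule in rules:
        pattern = rule.get("api_key_pattern")
        # fnmatchcase is case-sensitive on every platform.
        if pattern is not None and fnmatchcase(api_key, pattern):
            return rule
        if rule.get("tenant_id") is not None and rule["tenant_id"] == tenant_id:
            return rule
    return default
```

Because evaluation stops at the first match, a catch-all pattern such as `*` placed at the top of the list would shadow every rule below it, so specific patterns belong before general ones.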

Runtime Management

Budget rules can be managed at runtime via the Admin API without restarting the proxy.