Budget Enforcement
Set daily and monthly spend limits per API key, per agent, or per tenant. When a limit is exceeded, requests are blocked before they reach the LLM provider.
How It Works
- Request arrives at the proxy
- Budget manager checks current spend against limits
- If over limit: return 429 immediately (no API call, no cost)
- If approaching limit (soft limit): add warning header, forward request
- If under limit: forward request normally
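The steps above can be sketched as a small decision function. This is a minimal illustration, not AgentLedger's actual code: the names `Budget`, `Decision`, and `check_budget` are hypothetical, and it assumes that "over limit" means spend has reached or passed either the daily or the monthly limit, and that the soft threshold applies to both.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    BLOCK = "block"      # return 429, no upstream call
    WARN = "warn"        # forward, attach a warning header
    FORWARD = "forward"  # forward unchanged


@dataclass
class Budget:
    # Field names mirror the config keys; the class itself is hypothetical.
    daily_limit_usd: float
    monthly_limit_usd: float
    soft_limit_pct: float = 0.8
    action: str = "block"


def check_budget(budget: Budget, daily_spent: float, monthly_spent: float) -> Decision:
    """Decide what to do with a request given current spend (assumed semantics)."""
    over = (
        daily_spent >= budget.daily_limit_usd
        or monthly_spent >= budget.monthly_limit_usd
    )
    if over:
        # With action "warn", the request is forwarded with a header only.
        return Decision.BLOCK if budget.action == "block" else Decision.WARN

    near = (
        daily_spent >= budget.soft_limit_pct * budget.daily_limit_usd
        or monthly_spent >= budget.soft_limit_pct * budget.monthly_limit_usd
    )
    return Decision.WARN if near else Decision.FORWARD
```

For example, with the default limits of 50 USD daily and 500 USD monthly, `check_budget(Budget(50.0, 500.0), 45.0, 100.0)` falls in the soft-limit band and yields `Decision.WARN`.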
Configuration
budgets:
  default:
    daily_limit_usd: 50.0
    monthly_limit_usd: 500.0
    soft_limit_pct: 0.8   # warn at 80%
    action: "block"       # "block" returns 429, "warn" adds header only
  rules:
    - api_key_pattern: "sk-proj-dev-*"
      daily_limit_usd: 5.0
      monthly_limit_usd: 50.0
      action: "block"
    - tenant_id: "alpha"
      daily_limit_usd: 100.0
      monthly_limit_usd: 1000.0
      action: "block"
Block Response
When a limit is hit:
{
  "error": {
    "type": "budget_exceeded",
    "message": "spending limit exceeded",
    "daily_spent": 12.50,
    "daily_limit": 10.00,
    "monthly_spent": 45.00,
    "monthly_limit": 500.00
  }
}
HTTP status: 429 Too Many Requests
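A client may want to tell a budget block apart from other 429 responses, such as a provider rate limit. The sketch below parses a response body shaped like the example above. The helper name `parse_budget_error` is hypothetical, and it assumes the body always follows the documented `error` object layout.

```python
import json
from typing import Optional


def parse_budget_error(status: int, body: str) -> Optional[dict]:
    """Return overage details if the response is a budget block, else None."""
    if status != 429:
        return None
    try:
        error = json.loads(body).get("error", {})
    except (json.JSONDecodeError, AttributeError):
        return None  # not JSON, or not a JSON object
    if not isinstance(error, dict) or error.get("type") != "budget_exceeded":
        return None  # some other 429, e.g. a provider rate limit
    return {
        # Missing fields are treated as "not over" rather than raising.
        "daily_over": error.get("daily_spent", 0.0) >= error.get("daily_limit", float("inf")),
        "monthly_over": error.get("monthly_spent", 0.0) >= error.get("monthly_limit", float("inf")),
        "message": error.get("message", ""),
    }
```

Because the block is a spend limit and not a rate limit, retrying immediately is unlikely to help; a caller would typically surface the message or wait for the budget period to roll over.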
Soft Limits
When soft_limit_pct is configured, AgentLedger adds a warning header to the response once spend approaches the limit (for example, at 80% of the limit with soft_limit_pct: 0.8).
The request is still forwarded; soft limits are informational only.
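As a worked example of the threshold arithmetic, the snippet below computes the spend at which warnings would begin. It assumes the soft threshold is simply the limit multiplied by `soft_limit_pct` and that it applies to the daily and monthly limits alike; `soft_threshold_usd` is a hypothetical helper name.

```python
def soft_threshold_usd(limit_usd: float, soft_limit_pct: float) -> float:
    """Spend level (in USD) at which a soft-limit warning would start."""
    return limit_usd * soft_limit_pct


# Using the default budget from the configuration section.
daily_warn_at = soft_threshold_usd(50.0, 0.8)
monthly_warn_at = soft_threshold_usd(500.0, 0.8)
```

With the default budget, warnings would start at about 40 USD of daily spend and 400 USD of monthly spend.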
Pre-Flight Estimation
AgentLedger calculates the worst-case cost of a request from max_tokens before forwarding it to the API. If that estimate would exceed the remaining budget, the request is rejected immediately, so no spend is incurred.
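One way such an estimate could work is sketched below. This is an illustration under stated assumptions, not AgentLedger's actual formula: it assumes the worst case bills the full `max_tokens` as output plus the known prompt tokens as input, and the per-1K-token prices are hypothetical parameters supplied by the caller.

```python
def worst_case_cost_usd(
    prompt_tokens: int,
    max_tokens: int,
    input_price_per_1k: float,   # hypothetical price, USD per 1,000 input tokens
    output_price_per_1k: float,  # hypothetical price, USD per 1,000 output tokens
) -> float:
    """Upper bound on request cost, assuming the model emits all max_tokens."""
    input_cost = (prompt_tokens / 1000) * input_price_per_1k
    output_cost = (max_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


def preflight_allows(estimated_usd: float, spent_usd: float, limit_usd: float) -> bool:
    """True if the worst-case cost still fits within the remaining budget."""
    return spent_usd + estimated_usd <= limit_usd
```

Under these assumptions, a request with 1,000 prompt tokens and `max_tokens` of 2,000, priced at 0.50 and 1.50 USD per 1K tokens, has a worst-case cost of 3.50 USD. It would be rejected if 47 USD of a 50 USD daily limit were already spent.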
Per-Key Rules
Rules use glob patterns to match API keys:
| Pattern | Matches |
|---|---|
| `sk-proj-dev-*` | All keys starting with `sk-proj-dev-` |
| `sk-*` | All keys starting with `sk-` |
| `*` | All keys |
Rules are evaluated in order. The first matching rule wins. If no rule matches, the default applies.
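First-match evaluation can be sketched with Python's standard `fnmatch` module. This assumes AgentLedger's glob semantics match `fnmatchcase` (where `*` matches any run of characters), which the source does not confirm. The `sk-*` rule and its limits, along with the `resolve_budget` helper, are hypothetical and exist only to show ordering.

```python
from fnmatch import fnmatchcase

# Order matters: the more specific pattern must come before the broader one.
RULES = [
    {"api_key_pattern": "sk-proj-dev-*", "daily_limit_usd": 5.0, "monthly_limit_usd": 50.0},
    {"api_key_pattern": "sk-*", "daily_limit_usd": 20.0, "monthly_limit_usd": 200.0},  # hypothetical rule
]
DEFAULT = {"daily_limit_usd": 50.0, "monthly_limit_usd": 500.0}


def resolve_budget(api_key: str, rules: list, default: dict) -> dict:
    """Return the first rule whose pattern matches the key, else the default."""
    for rule in rules:
        if fnmatchcase(api_key, rule["api_key_pattern"]):
            return rule
    return default
```

If the two rules were listed in the opposite order, `sk-*` would match every development key first and the `sk-proj-dev-*` rule would never apply, so specific patterns belong above general ones.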
Runtime Management
Budget rules can be managed at runtime via the Admin API without restarting the proxy.