Rate Limits & Quotas

Two independent budgets govern every call: a per-route token bucket (QPS) and a plan-level credit quota (monthly spend). Both surface as HTTP 429 with explicit Retry-After guidance.

Plan budgets

Plan	Req / minute	Concurrent jobs	Monthly credits
Free	60	1	200
Developer	600	4	5 000
Growth	1 800	16	50 000
Enterprise	custom	custom	pooled

Per-route ceilings

Even on unlimited-plan tenants, each route carries a safety ceiling to prevent queue-bomb or polling-loop abuse. Keep long-running jobs asynchronous — don't poll in a tight loop.

Route	Budget	Notes
POST /v1/client/jobs	1 req / sec / tenant	Job submission; throttled to prevent queue-bomb DoS.
POST /v1/client/datasets/upload	10 req / min / tenant	Covers multipart + profiling; larger datasets bill per MB.
POST /v1/client/seals	60 req / min / tenant	Mint a new sealed-contract seal.
GET /v1/client/jobs/:id	60 req / min / key	Polling — use webhooks where possible to avoid burning this budget.
POST /v1/client/agent/projects	10 req / min / tenant	ADS project creation. Plan + approve + stream reuse the same budget.
GET /v1/client/evidence/:id	30 req / min / key	Bundle download. Cache locally; bundles are immutable.
POST /v1/client/webhooks	10 req / min / tenant	Webhook registration / rotation.
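
A client can stay under a per-route ceiling by throttling locally before a request leaves the process. The following is a minimal sketch of a client-side token bucket. The class name TokenBucket, its parameters, and the two example buckets are illustrative and not part of the SDK, and the server's actual refill behaviour may differ from this model.

```python
import time


class TokenBucket:
    """Client-side token bucket (illustrative; not part of the SDK)."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_per_sec = float(refill_per_sec)
        self.clock = clock
        self.tokens = float(capacity)   # start with a full bucket
        self.updated = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last update, up to capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now

    def try_acquire(self):
        """Take one token if available; return True on success."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self):
        """Seconds until the next token is available (0.0 if one is ready)."""
        self._refill()
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) / self.refill_per_sec


# Budgets taken from the per-route table, expressed as tokens per second.
submit_bucket = TokenBucket(capacity=1, refill_per_sec=1.0)    # POST /v1/client/jobs
poll_bucket = TokenBucket(capacity=60, refill_per_sec=1.0)     # GET /v1/client/jobs/:id
```

Before each submission, call submit_bucket.try_acquire() and, if it returns False, sleep for submit_bucket.wait_time() seconds before trying again.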

Response headers

Every response includes rate-limit telemetry so a well-behaved client can pace itself without ever hitting 429.

HTTP/1.1 200 OK
X-RateLimit-Limit:       60
X-RateLimit-Remaining:   57
X-RateLimit-Reset:       1700003600
X-RateLimit-Route:       GET /v1/client/jobs/:id
Content-Type:            application/json

Limit — max allowed per window
Remaining — tokens left in the current window
Reset — Unix timestamp when the bucket refills
Route — which budget was charged
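
One way a client might pace itself from these headers is to spread the remaining tokens evenly over the time left in the window. The sketch below assumes the headers arrive as a mapping of strings; the function name pacing_delay is hypothetical and not part of the SDK.

```python
import time


def pacing_delay(headers, now=None):
    """Return seconds to sleep before the next call on the same route.

    Spreads the remaining tokens evenly over the time left until the
    bucket refills; waits for the reset when no tokens remain.
    """
    now = time.time() if now is None else now
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])   # Unix timestamp
    seconds_left = max(0.0, reset_at - now)
    if remaining <= 0:
        return seconds_left
    return seconds_left / remaining
```

With the sample response above (57 tokens remaining), a call made 57 seconds before the reset would be followed by a one-second pause. HTTP header names are case-insensitive, so a plain dict lookup like this one only works if the HTTP library preserves the casing shown.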

Handling 429

When a budget trips, the response carries a Retry-After header (delta-seconds or HTTP-date) and an error_code of RATE_LIMITED. The SDK already respects the header; a hand-written client must honour it.

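# The SDK uses Retry-After when the server sets it, falling back to
# exponential backoff with a 1s floor. Nothing to configure.
# RadMahAIError is the assumed name of the SDK's exception class;
# check it against your installed SDK version.
from radmah_sdk import RadMahClient, RadMahAIError

client = RadMahClient(api_key="sl_live_…", max_retries=5)
for jid in jobs:  # jobs: iterable of job IDs collected at submission time
    try:
        status = client.jobs.get(jid)   # polls safely under backpressure
    except RadMahAIError as exc:
        if exc.status_code != 429:
            raise  # not a rate-limit failure; surface it to the caller
        # Every automatic retry is spent, so the budget is persistently
        # exhausted. Move to a webhook subscription instead of polling.
        schedule_switch_to_webhooks()  # application-defined helper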
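
For a hand-written client, the Retry-After value has to be handled in both of its forms. The sketch below converts either form to a wait in seconds using only the Python standard library; the function name retry_after_seconds is hypothetical, and malformed values are left to raise rather than being silently ignored.

```python
import time
from datetime import timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Convert a Retry-After header value to a non-negative wait in seconds."""
    now = time.time() if now is None else now
    value = value.strip()
    if value.isdigit():
        # delta-seconds form, e.g. "120"
        return float(value)
    # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    when = parsedate_to_datetime(value)
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)   # treat naive dates as UTC
    return max(0.0, when.timestamp() - now)
```

A client would sleep for the returned number of seconds before retrying the same route, and treat a date that has already passed as "retry now".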

Patterns that burn budget

Tight polling on GET /v1/client/jobs/:id. Prefer webhooks; if you must poll, back off exponentially for as long as the job remains in a non-terminal state.
Parallel submit loops that ignore concurrent_jobs. Use client.batch_create_jobs(), which respects the ceiling server-side.
Synchronous ADS streaming across dozens of keys. Open one streaming connection per project; don't re-establish on every turn.
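
Where polling cannot be avoided, the exponential backoff recommended above might look like the following sketch. The helper names backoff_delays and poll_until_terminal, the default delays, and the max_polls limit are assumptions for illustration; the fetch and terminal-state checks are supplied by the caller because the job status values are not specified here.

```python
import random
import time


def backoff_delays(base=1.0, factor=2.0, cap=60.0, jitter=0.0):
    """Yield an endless sequence of delays: base, base*factor, ... capped at cap."""
    delay = base
    while True:
        # Optional random jitter keeps many clients from retrying in lockstep.
        yield min(cap, delay) + random.uniform(0.0, jitter)
        delay = min(cap, delay * factor)


def poll_until_terminal(fetch_status, is_terminal, sleep=time.sleep, max_polls=20):
    """Call fetch_status() with growing pauses until is_terminal(status) is true."""
    delays = backoff_delays()
    for _ in range(max_polls):
        status = fetch_status()
        if is_terminal(status):
            return status
        sleep(next(delays))
    raise TimeoutError("job did not reach a terminal state")
```

Here fetch_status could wrap a call such as client.jobs.get(jid), with is_terminal encoding whichever statuses your integration treats as final.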

Need a higher ceiling? Enterprise plans negotiate custom per-route budgets with pooled credits across a tenant's keys — contact sales with your steady-state QPS + burst peak.