Rate Limits & Quotas

Two independent budgets govern every call: a per-route token bucket (QPS) and a plan-level credit quota (monthly spend). Both surface as HTTP 429 with explicit Retry-After guidance.

Plan budgets

Plan          Req / minute    Concurrent jobs    Monthly credits
Free          60              1                  200
Developer     600             4                  5 000
Growth        1 800           16                 50 000
Enterprise    custom          custom             pooled

Per-route ceilings

Even on unlimited-plan tenants, each route carries a safety ceiling to prevent queue-bomb or polling-loop abuse. Keep long-running jobs asynchronous — don't poll in a tight loop.

Route                              Budget                      Notes
POST /v1/client/jobs               1 request / sec / tenant    Job submission: controlled to prevent queue-bomb DoS.
POST /v1/client/datasets/upload    10 req / min / tenant       Covers multipart + profiling; larger datasets bill per MB.
POST /v1/client/seals              60 req / min / tenant       Mint new sealed contract seal.
GET /v1/client/jobs/:id            60 req / min / key          Polling: use webhooks where possible to avoid burning this budget.
POST /v1/client/agent/projects     10 req / min / tenant       ADS project creation. Plan + approve + stream reuse the same budget.
GET /v1/client/evidence/:id        30 req / min / key          Bundle download. Cache locally; bundles are immutable.
POST /v1/client/webhooks           10 req / min / tenant       Webhook registration / rotation.
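
One way a client might stay under these ceilings is to pace itself locally with a small token bucket before each call. The sketch below is illustrative only: TokenBucket, its parameters, and the injectable clock are hypothetical helpers, not part of the RadMah SDK, and the burst capacities are assumptions, since the server's actual bucket sizes are not documented here.

```python
import time


class TokenBucket:
    """Client-side token bucket (hypothetical helper, not an SDK class)."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = float(rate_per_sec)    # tokens added per second
        self.capacity = float(capacity)    # maximum burst size (assumed)
        self.clock = clock                 # injectable for testing
        self.tokens = float(capacity)
        self.updated = clock()

    def _refill(self):
        # Credit tokens for the time elapsed since the last update.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self):
        """Take one token if available; return True on success."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self):
        """Seconds until one token is available (0.0 if one is ready)."""
        self._refill()
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) / self.rate


# Rates follow the documented route budgets; capacities are assumptions.
submit_bucket = TokenBucket(rate_per_sec=1.0, capacity=1)        # POST /v1/client/jobs
poll_bucket = TokenBucket(rate_per_sec=60 / 60.0, capacity=60)   # GET /v1/client/jobs/:id
```

A caller would check submit_bucket.try_acquire() before each submission and, if it returns False, sleep for submit_bucket.wait_time() seconds instead of sending a request that is likely to be rejected.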

Response headers

Every response includes rate-limit telemetry so a well-behaved client can pace itself without ever hitting 429.

HTTP/1.1 200 OK
X-RateLimit-Limit:       60
X-RateLimit-Remaining:   57
X-RateLimit-Reset:       1700003600
X-RateLimit-Route:       GET /v1/client/jobs/:id
Content-Type:            application/json
  • Limit — max allowed per window
  • Remaining — tokens left in the current window
  • Reset — Unix timestamp when the bucket refills
  • Route — which budget was charged
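
As a rough sketch of how a client could use these headers to pace itself, the function below spreads the remaining tokens evenly over the time left before the reset. The name pacing_delay is hypothetical, and the code assumes the headers arrive as a plain mapping keyed exactly as shown above; many HTTP libraries expose a case-insensitive mapping instead.

```python
import time


def pacing_delay(headers, now=None):
    """Return seconds to wait before the next call on this route.

    Spreads the remaining tokens evenly over the time left until the
    bucket refills; if no tokens are left, waits for the reset.
    """
    now = time.time() if now is None else now
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])
    seconds_left = max(0.0, reset_at - now)
    if remaining <= 0:
        return seconds_left          # budget spent: wait for the refill
    return seconds_left / remaining  # even spacing across the window
```

With the sample response above (57 tokens remaining) and 57 seconds left until the reset, this yields a one-second gap between calls. Even spacing is only one possible policy; a client that needs short bursts could spend tokens faster and accept a longer wait near the end of the window.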

Handling 429

When a budget trips, the response carries a Retry-After header (delta-seconds or HTTP-date) and an error_code of RATE_LIMITED. The SDK already respects the header; a hand-written client must honour it.

# The SDK uses Retry-After when the server sets it, falling back to
# exponential backoff with a 1s floor. Nothing to configure.
# NOTE: the exception name was garbled in the source ("RadMah AIError");
# RadMahAIError is assumed here. Check the SDK's actual exports.
from radmah_sdk import RadMahClient, RadMahAIError

client = RadMahClient(api_key="sl_live_…", max_retries=5)
for jid in jobs:  # jobs: iterable of job IDs collected earlier
    try:
        status = client.jobs.get(jid)   # polls safely under backpressure
    except RadMahAIError as exc:
        if exc.status_code != 429:
            raise  # unrelated failure: don't swallow it
        # Every automatic retry exhausted. Quota is persistently
        # exhausted, so move to a webhook subscription instead of polling.
        schedule_switch_to_webhooks()
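
For a hand-written client, the main subtlety is that Retry-After may be either delta-seconds or an HTTP-date. The following is a minimal sketch of parsing both forms with the Python standard library. The function name retry_after_seconds is hypothetical, and the 1-second fallback for unparseable values is an assumption chosen to mirror the SDK's stated backoff floor.

```python
import time
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Convert a Retry-After header value into seconds to wait."""
    now = time.time() if now is None else now
    value = value.strip()
    if value.isdigit():
        # delta-seconds form, e.g. "120"
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return 1.0  # unparseable header: assumed 1 s floor
    # A date already in the past means the client may retry immediately.
    return max(0.0, when.timestamp() - now)
```

A client would sleep for the returned number of seconds before retrying the same request, and should still cap the total number of retries so that a persistently exhausted quota surfaces as an error.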

Patterns that burn budget

  • Tight polling on GET /v1/client/jobs/:id. Prefer webhooks; if you must poll, back off exponentially while the job remains in a non-terminal state.
  • Parallel submit loops that ignore concurrent_jobs. Use client.batch_create_jobs(), which respects the ceiling server-side.
  • Synchronous ADS streaming across dozens of keys. Open one streaming connection per project; don't re-establish on every turn.
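
If polling cannot be avoided, the backoff advice in the first bullet could look roughly like the sketch below. Everything here is hypothetical scaffolding rather than SDK API: backoff_delays and poll_until_terminal are illustrative names, the caller supplies get_status and is_terminal, and the base, factor, and cap values are assumptions to tune against the route's budget.

```python
import random


def backoff_delays(base=1.0, factor=2.0, cap=60.0, jitter=0.0, rng=random.random):
    """Yield an endless sequence of exponentially growing delays in seconds."""
    delay = base
    while True:
        # Optional jitter spreads out clients that started polling together.
        yield min(cap, delay) * (1.0 + jitter * rng())
        delay = min(cap, delay * factor)


def poll_until_terminal(get_status, is_terminal, sleep, max_polls=20):
    """Poll get_status() with backoff until is_terminal(status) is true."""
    delays = backoff_delays()
    for _ in range(max_polls):
        status = get_status()
        if is_terminal(status):
            return status
        sleep(next(delays))  # wait longer after every non-terminal result
    raise TimeoutError("job did not reach a terminal state")
```

In practice get_status might wrap client.jobs.get(jid) and sleep might be time.sleep. With the defaults, twenty polls span several minutes while consuming only a small share of the 60 req / min per-key polling budget.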

Need a higher ceiling? Enterprise plans negotiate custom per-route budgets with pooled credits across a tenant's keys. Contact sales with your steady-state QPS and burst peak.