Every error from api.aurous-labs.com follows the same envelope. This page covers the shape, the five-type taxonomy, and the recommended client-side handling for each class.
The envelope
- type: one of the five top-level types (see Taxonomy below). Coarse-grained; useful for branching in error-handling code.
- code: a specific machine-readable code like max_tokens_exceeds_hard_cap. Stable across versions; safe to switch on.
- message: a human-readable explanation. May contain the offending value to help debug.
- doc_url: a canonical anchor link to https://docs.aurous-labs.com/errors#<code>. Each error code has its own page.
- request_id: req_<26-char ULID>. Quote this when filing a support ticket so we can find your request.
- param (optional): the request parameter that failed validation. Present on most invalid_request codes.
- status (optional): the upstream provider’s status when we map a provider error. Present on *_provider_* codes.
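The field list above can be mirrored in a small typed container so that handling code reads attributes instead of raw dictionary keys. This is a minimal sketch, not an official client: the names `ApiError` and `parse_error` are hypothetical, and the page does not say whether the fields sit at the top level of the JSON body or under a wrapper key, so the sketch assumes either and unwraps an `error` object if one is present.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ApiError:
    """Hypothetical container for the envelope fields listed above."""
    type: str
    code: str
    message: str
    doc_url: str
    request_id: str
    param: Optional[str] = None  # optional: present on most invalid_request codes
    status: Any = None           # optional: upstream provider status, type not specified


def parse_error(body: Mapping[str, Any]) -> ApiError:
    """Build an ApiError from an already-decoded JSON object."""
    # Assumption: the fields may be wrapped in an "error" object; unwrap if so.
    fields = body.get("error", body)
    return ApiError(
        type=fields["type"],
        code=fields["code"],
        message=fields["message"],
        doc_url=fields["doc_url"],
        request_id=fields["request_id"],
        param=fields.get("param"),
        status=fields.get("status"),
    )
```

Required fields are read with `[...]` so a malformed envelope fails loudly with a `KeyError`, while the two optional fields default to `None`.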
HTTP status codes 400 / 401 / 403 / 404 / 409 / 422 / 429 / 5xx follow REST conventions. The type is correlated with the status but not perfectly aligned (e.g. some 422s are type: invalid_request).
Taxonomy
Errors are bucketed into 5 top-level types:

| type | HTTP statuses | Meaning |
|---|---|---|
| invalid_request | 400, 422 | Your request was malformed or violated a constraint |
| authentication | 401, 403 | Your API key is missing, expired, or lacks scope |
| not_found | 404 | The resource doesn’t exist (or you can’t see it) |
| rate_limit | 429 | You hit a per-team rate cap |
| server_error | 500, 502, 503 | Something on our side failed |
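The taxonomy table can be kept as data on the client, for example to log a warning when a response pairs a type with a status the table does not list. The sketch below only transcribes the table; the names `TYPICAL_STATUSES` and `is_typical_pair` are hypothetical. Because type and status are not perfectly aligned, a `False` result should be treated as a signal worth logging, not as an error.

```python
# Transcribed from the taxonomy table above. Type and HTTP status are
# correlated but not perfectly aligned, so this is a hint, not a contract.
TYPICAL_STATUSES = {
    "invalid_request": {400, 422},
    "authentication": {401, 403},
    "not_found": {404},
    "rate_limit": {429},
    "server_error": {500, 502, 503},
}


def is_typical_pair(error_type: str, http_status: int) -> bool:
    """Return True when the table lists this status for this type."""
    return http_status in TYPICAL_STATUSES.get(error_type, set())
```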
Retry decision tree
For each error class:

invalid_request (400 / 422): DO NOT RETRY
Retrying with the same body will return the same error. Look at code + param + message and fix the request. Common offenders:
- max_tokens_exceeds_hard_cap: drop max_tokens below the model’s hard cap (see aurous_metadata.hard_max_tokens on the model)
- model_wrong_kind: you’re hitting a chat endpoint with an embedding model, or vice versa
- embeddings_input_too_many_items: trim parts; see Embedding limits
- embeddings_input_too_large: text + visual + video sum exceeded 128K tokens
- embeddings_batch_not_supported: you passed input: ["a","b","c"]; see OpenAI batch incompat
- idempotency_key_in_use: same key, different body; use a fresh key
- response_format_too_deep / response_format_too_large: simplify your JSON schema
authentication (401 / 403): STOP AND INVESTIGATE
Either:
- 401 invalid_api_key: the key doesn’t authenticate. Check that it’s stored correctly, that it hasn’t been rotated or deactivated, and that it carries no stray whitespace.
- 403 insufficient_scope: the key authenticates but doesn’t have the required scope for this endpoint. Mint a new key with the appropriate scope (full for write, read for read-only).
not_found (404): STOP
- model_not_found: the model slug doesn’t exist; call GET /v1/models to see what’s available
- chat_cancel_target_not_found: the cmp_<id> doesn’t exist OR belongs to a different team (we treat the second as 404 for disclosure safety)
- resource_not_found: generic 404 for any other lookup
rate_limit (429): BACK OFF + RETRY
Two rate-limit buckets exist:
- RPM (requests per minute): X-RateLimit-Limit / Remaining / Reset headers
- TPM (tokens per minute): X-RateLimit-TPM-Limit / Remaining / Reset headers (chat + embedding only)

Honor the Retry-After header (seconds): wait that long, then retry. The OpenAI SDK’s built-in retry honors Retry-After automatically.
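For clients that do not rely on an SDK's built-in retry, reading Retry-After takes only a few lines. This is a minimal sketch: the function name `retry_after_seconds` and the 1-second fallback are assumptions, not documented behaviour, and the header is assumed to carry a plain number of seconds as described above.

```python
from typing import Mapping


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Return the wait from Retry-After, or `default` if absent or unparsable."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    if raw is None:
        return default
    try:
        # Clamp negatives to zero so the caller never sleeps a negative time.
        return max(0.0, float(raw.strip()))
    except ValueError:
        return default
```

A caller would typically pass the result to `time.sleep` before re-sending the request.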
server_error (500 / 502 / 503): RETRY WITH JITTER
Two sub-classes:
- Provider-relayed (chat_provider_unavailable, chat_provider_rate_limited, embeddings_provider_unknown_error): the upstream provider is blipping. Will usually clear within 30 seconds.
- Platform (internal_error, generic 500): something broke on our side. File a support ticket with the request_id; we’ll investigate.

Send the same Idempotency-Key: <uuid> on every retry of the same logical operation. See Idempotency.
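Retrying with jitter while reusing one Idempotency-Key can be sketched as follows. Everything here is illustrative: `RetryableError`, `backoff_delay`, and `call_with_retries` are hypothetical names, the `send` callable stands in for whatever function issues the request and raises on a server_error, and the base delay, 30-second cap, and attempt limit are assumed values, not documented ones. The jitter scheme is the common "full jitter" variant of exponential backoff.

```python
import random
import time
import uuid
from typing import Callable


class RetryableError(Exception):
    """Hypothetical marker raised by `send` for server_error responses."""


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full jitter: a random delay scaled by min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(send: Callable[[str], object], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> object:
    """Call send(idempotency_key), retrying on RetryableError with jitter."""
    # One key for the whole logical operation, reused on every retry.
    key = str(uuid.uuid4())
    for attempt in range(max_attempts):
        try:
            return send(key)
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(backoff_delay(attempt))
```

Generating the key once, outside the loop, is the important part: a fresh key per attempt would make each retry look like a new operation.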
Standard headers on errors
Every error response carries the same headers as a successful response:
- Aurous-Request-Id: req_<ulid> for support
- Aurous-Version: the API contract version applied
- X-RateLimit-*: rate-limit headers (even on 429; that’s how you know the bucket state)
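Since these headers are present on every error response, a client can collect them into one record for logs or support tickets. A minimal sketch, assuming headers arrive as a plain string mapping; the function name `support_context` and the shape of the returned dictionary are hypothetical.

```python
from typing import Any, Dict, Mapping


def support_context(headers: Mapping[str, str]) -> Dict[str, Any]:
    """Collect the standard headers into one record for logging."""
    # Header names are case-insensitive; compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "request_id": lowered.get("aurous-request-id"),
        "api_version": lowered.get("aurous-version"),
        # Keep every X-RateLimit-* header, covering both RPM and TPM buckets.
        "rate_limit": {
            name: value
            for name, value in lowered.items()
            if name.startswith("x-ratelimit-")
        },
    }
```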
Sample handler (production-grade)
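A handler along these lines can be sketched as a pure function that maps one error envelope to a client decision, following the retry decision tree above. This is an illustrative sketch under stated assumptions, not a reference implementation: `Action`, `Decision`, and `decide` are hypothetical names, the envelope is assumed to be an already-decoded mapping of the fields described earlier, and unknown types are assumed to be non-retryable.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Mapping, Optional


class Action(Enum):
    FIX_REQUEST = "fix_request"              # invalid_request: do not retry
    INVESTIGATE = "investigate"              # authentication: stop and investigate
    STOP = "stop"                            # not_found: stop
    BACK_OFF = "back_off"                    # rate_limit: wait, then retry
    RETRY_WITH_JITTER = "retry_with_jitter"  # server_error: retry with jitter


ACTION_BY_TYPE = {
    "invalid_request": Action.FIX_REQUEST,
    "authentication": Action.INVESTIGATE,
    "not_found": Action.STOP,
    "rate_limit": Action.BACK_OFF,
    "server_error": Action.RETRY_WITH_JITTER,
}


@dataclass(frozen=True)
class Decision:
    action: Action
    retryable: bool
    wait_seconds: Optional[float]  # set only when Retry-After was usable
    summary: str                   # one line for logs or a support ticket


def decide(error: Mapping[str, Any], headers: Mapping[str, str]) -> Decision:
    """Map one decoded error envelope plus response headers to a Decision."""
    # Assumption: types outside the taxonomy are treated as non-retryable.
    action = ACTION_BY_TYPE.get(error.get("type"), Action.STOP)
    retryable = action in (Action.BACK_OFF, Action.RETRY_WITH_JITTER)

    wait = None
    if action is Action.BACK_OFF:
        lowered = {name.lower(): value for name, value in headers.items()}
        try:
            wait = max(0.0, float(lowered.get("retry-after", "")))
        except ValueError:
            wait = None  # header missing or not numeric; caller picks a default

    summary = (f"{error.get('type')}/{error.get('code')} "
               f"request_id={error.get('request_id')}")
    if error.get("param"):
        summary += f" param={error['param']}"
    return Decision(action, retryable, wait, summary)
```

Keeping the decision separate from the sleeping and re-sending makes the policy easy to unit-test; the caller remains responsible for the actual wait and for reusing the same Idempotency-Key on each retry.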
Where to next?
- Errors index: every error code, every page
- Idempotency: safe-retry contract for writes
- Rate limits: RPM + TPM buckets, headers, Retry-After
- How we count tokens: budget input tokens to avoid embeddings_input_too_large
- Cost transparency: receipt math + reconciliation

