Error handling

Every error response from api.aurous-labs.com follows the same envelope. This page covers the shape, the 5-type taxonomy, and the recommended client-side handling for each class.

The envelope

{
  "error": {
    "type": "invalid_request",
    "code": "max_tokens_exceeds_hard_cap",
    "message": "max_tokens 16384 exceeds the model's hard cap of 8192 for aurous-grow-2.0-pro. See https://docs.aurous-labs.com/errors#max_tokens_exceeds_hard_cap.",
    "doc_url": "https://docs.aurous-labs.com/errors#max_tokens_exceeds_hard_cap",
    "request_id": "req_01HXMQ7Z3K8Y2ABCDEFGHJKM",
    "param": "max_tokens"
  }
}

The body fields:

type — one of the five top-level types (see Taxonomy below). Coarse-grained; useful for branching in error-handling code.
code — a specific machine-readable code like max_tokens_exceeds_hard_cap. Stable across versions; safe to switch on.
message — a human-readable explanation. May contain the offending value to help debug.
doc_url — a canonical anchor link to https://docs.aurous-labs.com/errors#<code>. Each error code has its own page.
request_id — req_<26-char ULID> — quote this when filing a support ticket so we can find your request.
param (optional) — the request parameter that failed validation. Present on most invalid_request codes.
status (optional) — the upstream provider’s status when we map a provider error. Present on *_provider_* codes.
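As a minimal sketch of consuming this envelope on the client side, the fields above can be loaded into a small typed container. The class name `AurousError` and the helper `parse_error_envelope` are hypothetical, and treating `status` as an integer is an assumption, since the field list does not state its type.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class AurousError:
    """Typed view of the error envelope; field names mirror the list above."""
    type: str
    code: str
    message: str
    doc_url: Optional[str] = None
    request_id: Optional[str] = None
    param: Optional[str] = None    # optional: mostly on invalid_request codes
    status: Optional[int] = None   # optional: assumed to be an int when present


def parse_error_envelope(raw: str) -> AurousError:
    """Parse a JSON error body, ignoring fields this sketch does not know about."""
    err = json.loads(raw)["error"]
    known = {f.name for f in fields(AurousError)}
    return AurousError(**{k: v for k, v in err.items() if k in known})


sample = json.dumps({
    "error": {
        "type": "invalid_request",
        "code": "max_tokens_exceeds_hard_cap",
        "message": "max_tokens 16384 exceeds the model's hard cap of 8192 for aurous-grow-2.0-pro.",
        "doc_url": "https://docs.aurous-labs.com/errors#max_tokens_exceeds_hard_cap",
        "request_id": "req_01HXMQ7Z3K8Y2ABCDEFGHJKM",
        "param": "max_tokens",
    }
})
parsed = parse_error_envelope(sample)
print(parsed.type, parsed.code, parsed.param)
```

Dropping unknown keys keeps the parser tolerant if the envelope gains fields later; whether that is desirable depends on how strict your client needs to be.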

The HTTP status code is the canonical signal: 400 / 401 / 403 / 404 / 409 / 422 / 429 / 5xx follow REST conventions. The type is correlated with the status but not one-to-one (e.g. invalid_request covers both 400 and 422).
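Because the status code is the canonical signal, a client can decide retryability from the status alone. The sketch below encodes the retry guidance given later on this page (retry 429 and 5xx, nothing else); the function name `should_retry` is hypothetical.

```python
def should_retry(http_status: int) -> bool:
    """True for statuses this page marks as retryable: 429 and any 5xx."""
    return http_status == 429 or 500 <= http_status <= 599


for status in (400, 401, 404, 409, 422, 429, 500, 503):
    print(status, "retry" if should_retry(status) else "do not retry")
```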

Taxonomy

Errors are bucketed into 5 top-level types:

`type`	HTTP statuses	Meaning
`invalid_request`	400, 422	Your request was malformed or violated a constraint
`authentication`	401, 403	Your API key is missing, expired, or lacks scope
`not_found`	404	The resource doesn’t exist (or you can’t see it)
`rate_limit`	429	You hit a per-team rate cap
`server_error`	500, 502, 503	Something on our side failed
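The table can also be held as data, which makes it easy to log responses whose type and status disagree. This is a sketch only: `EXPECTED_STATUSES` and `type_matches_status` are hypothetical names, and since the type is not guaranteed to align with the status, a mismatch is worth logging rather than treating as fatal.

```python
# Mirrors the taxonomy table: type -> HTTP statuses listed for it.
EXPECTED_STATUSES = {
    "invalid_request": {400, 422},
    "authentication": {401, 403},
    "not_found": {404},
    "rate_limit": {429},
    "server_error": {500, 502, 503},
}


def type_matches_status(err_type: str, http_status: int) -> bool:
    """Return True when the status is one the table lists for this type."""
    return http_status in EXPECTED_STATUSES.get(err_type, set())


print(type_matches_status("invalid_request", 422))
print(type_matches_status("rate_limit", 500))
```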

A triage function for fast branching:

def handle_aurous_error(error: dict) -> str:
    """Map the `error` object from the envelope to a coarse client action."""
    code = error.get("code", "")
    err_type = error.get("type")

    if err_type == "invalid_request":
        return "fix-and-retry"  # 400/422: your bug; retrying without changes wastes credits
    if err_type == "authentication":
        return "rotate-or-escalate"  # 401/403: key issue; check secret store and scope
    if err_type == "not_found":
        return "stop"  # 404: the resource doesn't exist; retrying won't help
    if err_type == "rate_limit":
        return "back-off-and-retry"  # 429: wait Retry-After then retry
    if err_type == "server_error":
        if code.startswith(("chat_provider_", "embeddings_provider_")):
            return "retry-with-jitter"  # provider blip; will likely clear
        return "investigate"  # platform issue; file a support ticket
    return "unknown"

Retry decision tree

For each error class:

`invalid_request` (400 / 422) — DO NOT RETRY

Retrying with the same body will get the same 400. Look at code + param + message and fix the request. Common offenders:

max_tokens_exceeds_hard_cap — drop max_tokens below the model’s hard cap (see aurous_metadata.hard_max_tokens on the model)
model_wrong_kind — you’re hitting a chat endpoint with an embedding model, or vice versa
embeddings_input_too_many_items — trim parts; see Embedding limits
embeddings_input_too_large — text + visual + video sum exceeded 128K tokens
embeddings_batch_not_supported — you passed input: ["a","b","c"]; see OpenAI batch incompat
idempotency_key_in_use — same key, different body; use a fresh key
response_format_too_deep / response_format_too_large — simplify your JSON schema
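For the first offender in the list above, one option is to clamp max_tokens before sending. The sketch assumes the model object is available as a dict with a nested `aurous_metadata.hard_max_tokens` value; the helper name `clamp_max_tokens` and the exact dict layout are assumptions.

```python
def clamp_max_tokens(requested: int, model: dict) -> int:
    """Return max_tokens limited to the model's hard cap, if the cap is known."""
    cap = model.get("aurous_metadata", {}).get("hard_max_tokens")
    if cap is None:
        return requested  # no cap advertised; leave the request unchanged
    return min(requested, cap)


# Hypothetical model object; only the slug and the 8192 cap come from the example above.
model = {"id": "aurous-grow-2.0-pro", "aurous_metadata": {"hard_max_tokens": 8192}}
print(clamp_max_tokens(16384, model))
print(clamp_max_tokens(512, model))
```

Clamping silently changes what the caller asked for, so some clients may prefer to raise locally instead and surface the cap to the user.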

`authentication` (401 / 403) — STOP AND INVESTIGATE

Either:

401 invalid_api_key — the key doesn’t authenticate. Check it’s stored correctly; check it hasn’t been rotated/deactivated; check whitespace.
403 insufficient_scope — the key authenticates but doesn’t have the required scope for this endpoint. Mint a new key with appropriate scope (full for write, read for read-only).

In both cases, retrying without fixing the key will get the same 401/403. Don’t burn retries.

`not_found` (404) — STOP

model_not_found — the model slug doesn’t exist; fetch GET /v1/models to see what’s available
chat_cancel_target_not_found — the cmp_<id> doesn’t exist OR belongs to a different team (we treat the second as 404 for disclosure safety)
resource_not_found — generic 404 for any other lookup

Retrying won’t help. Either the id is wrong or the resource has been deleted.
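When model_not_found comes back, a client can compare the rejected slug with the ids returned by GET /v1/models and suggest the closest match. The sketch below works on a plain list of ids so it runs offline; `suggest_model` is a hypothetical helper, and every slug in the list other than aurous-grow-2.0-pro is invented for illustration.

```python
import difflib


def suggest_model(slug: str, available: list[str]) -> list[str]:
    """Return up to three available model ids that look similar to `slug`."""
    return difflib.get_close_matches(slug, available, n=3, cutoff=0.6)


# In practice this list would come from GET /v1/models.
available = ["aurous-grow-2.0-pro", "aurous-grow-2.0-mini", "aurous-embed-1.0"]
print(suggest_model("aurous-grow-2.0-pr", available))
```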

`rate_limit` (429) — BACK OFF + RETRY

Two rate-limit buckets exist:

RPM (requests per minute) — X-RateLimit-Limit / Remaining / Reset headers
TPM (tokens per minute) — X-RateLimit-TPM-Limit / Remaining / Reset headers (chat + embedding only)
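A sketch of reading both buckets from response headers follows. It assumes the shorthand above expands to X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset and the matching X-RateLimit-TPM-* names; the Reset value is kept as a raw string because its format is not specified here. `Bucket`, `read_bucket`, and the header values are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Bucket:
    limit: Optional[int]
    remaining: Optional[int]
    reset: Optional[str]  # raw header value; format is not assumed


def _to_int(value: Optional[str]) -> Optional[int]:
    """Convert a header value to int, returning None when absent or malformed."""
    try:
        return int(value) if value is not None else None
    except ValueError:
        return None


def read_bucket(headers: Mapping[str, str], prefix: str) -> Bucket:
    """Read <prefix>-Limit / -Remaining / -Reset, matching names case-insensitively."""
    lowered = {k.lower(): v for k, v in headers.items()}
    p = prefix.lower()
    return Bucket(
        limit=_to_int(lowered.get(f"{p}-limit")),
        remaining=_to_int(lowered.get(f"{p}-remaining")),
        reset=lowered.get(f"{p}-reset"),
    )


# Illustrative header values only.
headers = {
    "X-RateLimit-Limit": "600",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "12",
    "X-RateLimit-TPM-Limit": "200000",
    "X-RateLimit-TPM-Remaining": "153000",
    "X-RateLimit-TPM-Reset": "12",
}
rpm = read_bucket(headers, "X-RateLimit")
tpm = read_bucket(headers, "X-RateLimit-TPM")
print(rpm, tpm)
```

Comparing `rpm.remaining` and `tpm.remaining` after a 429 shows which bucket was exhausted, which helps decide whether to send fewer requests or smaller ones.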

On 429, the response carries a Retry-After header (seconds). Wait that long, then retry. The OpenAI SDK’s built-in retry honors Retry-After automatically.

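import time

import openai

# max_retries=0 turns off the SDK's built-in retries so they don't stack with this loop.
client = openai.OpenAI(
    base_url="https://api.aurous-labs.com/v1",
    api_key="YOUR_API_KEY",  # placeholder
    max_retries=0,
)

def chat_with_backoff(messages, max_retries=5):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="aurous-grow-2.0-pro",
                messages=messages,
                max_tokens=512,
            )
        except openai.RateLimitError as e:
            # Prefer the server's Retry-After (seconds); fall back to exponential backoff.
            retry_after = float(e.response.headers.get("retry-after", 2 ** attempt))
            time.sleep(min(retry_after, 30))  # cap at 30s
    raise RuntimeError("Exhausted retries")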

See Rate limits for the per-team caps and how to interpret the headers.

`server_error` (500 / 502 / 503) — RETRY WITH JITTER

Two sub-classes:

Provider-relayed (chat_provider_unavailable, chat_provider_rate_limited, embeddings_provider_unknown_error) — the upstream provider is blipping. Will usually clear within 30 seconds.
Platform (internal_error, generic 500) — something broke on our side. File a support ticket with the request_id; we’ll investigate.

For both, exponential backoff with jitter is the recommended retry pattern:

import random
import time

import openai

def with_backoff(fn, max_retries=5, base=1.0, cap=32.0):
    for attempt in range(max_retries):
        try:
            return fn()
        except openai.APIStatusError as e:
            # APIStatusError always carries status_code; the broader APIError
            # also covers connection failures, which have no status.
            if e.status_code >= 500:
                sleep_s = min(base * (2 ** attempt) + random.random(), cap)
                time.sleep(sleep_s)
                continue
            raise  # non-5xx errors don't retry
    raise RuntimeError("Exhausted retries")
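To see what this schedule costs in wall-clock time, the delay range for each attempt can be worked out from the same formula: base * 2^attempt plus up to one second of jitter, capped at cap. The helper `backoff_bounds` is a hypothetical name; the defaults are the ones used by with_backoff.

```python
def backoff_bounds(attempt: int, base: float = 1.0, cap: float = 32.0) -> tuple[float, float]:
    """Lower and upper limit of the sleep for one attempt (jitter is in [0, 1))."""
    lowest = min(base * (2 ** attempt), cap)
    highest = min(base * (2 ** attempt) + 1.0, cap)
    return lowest, highest


schedule = [backoff_bounds(a) for a in range(5)]
worst_case_total = sum(high for _, high in schedule)
for attempt, (low, high) in enumerate(schedule):
    print(f"attempt {attempt}: sleep between {low:.0f}s and {high:.0f}s")
print(f"total wait stays under {worst_case_total:.0f}s")
```

With the defaults, five attempts sleep for less than 36 seconds in total, which comfortably covers the roughly 30 seconds a provider blip usually takes to clear.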

Idempotency keys are critical here. Without them, a retry of a request that succeeded on the server but whose response was lost in transit will be charged twice. Send the same Idempotency-Key: <uuid> on every retry of the same logical operation, and generate a fresh key for each new operation. See Idempotency.
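The key-reuse rule can be sketched as follows: the key is generated once, outside the retry loop, and sent unchanged on every attempt. `call_with_idempotency` and the stub `flaky_send` are hypothetical; the stub stands in for a real HTTP call and fails twice to simulate a lost response. Sleeping between attempts is left out to keep the sketch short.

```python
import uuid


def call_with_idempotency(send, max_attempts: int = 3):
    """Call send(headers) up to max_attempts times, reusing one Idempotency-Key."""
    key = str(uuid.uuid4())  # one key per logical operation, not per attempt
    last_error = None
    for _ in range(max_attempts):
        try:
            return send({"Idempotency-Key": key})
        except ConnectionError as exc:
            last_error = exc  # response lost; safe to retry with the same key
    raise RuntimeError("Exhausted retries") from last_error


seen_keys = []


def flaky_send(headers):
    """Stub transport: records the key, fails twice, then succeeds."""
    seen_keys.append(headers["Idempotency-Key"])
    if len(seen_keys) < 3:
        raise ConnectionError("response lost in transit")
    return "ok"


result = call_with_idempotency(flaky_send)
print(result, len(seen_keys), len(set(seen_keys)))
```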

Standard headers on errors

Every error response carries the same headers as a successful response:

Aurous-Request-Id — req_<ulid> for support
Aurous-Version — the API contract version applied
X-RateLimit-* — rate-limit headers (even on 429 — that’s how you know the bucket state)

Sample handler (production-grade)

import logging
import random
import time

import openai


class AurousClient:
    def __init__(self, api_key: str):
        self.client = openai.OpenAI(
            base_url="https://api.aurous-labs.com/v1",
            api_key=api_key,
            max_retries=0,  # this class owns the retry loop; avoid stacking SDK retries
        )
        self.log = logging.getLogger(__name__)

    @staticmethod
    def _request_id(e: openai.APIStatusError) -> str | None:
        # Response headers live on e.response; lookup is case-insensitive.
        return e.response.headers.get("aurous-request-id")

    def chat(self, messages, idempotency_key: str | None = None, max_retries: int = 5):
        kwargs = {}
        if idempotency_key:
            # Same key on every attempt so a retried request is not charged twice.
            kwargs["extra_headers"] = {"Idempotency-Key": idempotency_key}
        for attempt in range(max_retries):
            try:
                return self.client.chat.completions.create(
                    model="aurous-grow-2.0-pro",
                    messages=messages,
                    max_tokens=512,
                    **kwargs,
                )
            except (openai.BadRequestError, openai.UnprocessableEntityError) as e:
                # 400 / 422: fix the request; don't loop
                self.log.error(f"Aurous {e.status_code}: {e.code} request_id={self._request_id(e)}")
                raise
            except openai.AuthenticationError as e:
                # 401: key issue; rotate or escalate
                self.log.error(f"Aurous 401: rotate the API key (request_id={self._request_id(e)})")
                raise
            except openai.PermissionDeniedError as e:
                # 403: wrong scope
                self.log.error(f"Aurous 403: insufficient scope (request_id={self._request_id(e)})")
                raise
            except openai.NotFoundError:
                # 404: stop
                raise
            except openai.RateLimitError as e:
                # 429: honor Retry-After, capped at 30s
                retry_after = float(e.response.headers.get("retry-after", 2 ** attempt))
                sleep_s = min(retry_after, 30)
                self.log.warning(f"Aurous 429: sleeping {sleep_s}s (request_id={self._request_id(e)})")
                time.sleep(sleep_s)
            except openai.APIStatusError as e:
                if e.status_code < 500:
                    raise  # any other 4xx (e.g. 409) is not retryable
                # 5xx: exponential backoff with jitter
                sleep_s = min(1.0 * (2 ** attempt) + random.random(), 32.0)
                self.log.warning(f"Aurous {e.status_code}: sleeping {sleep_s:.1f}s (request_id={self._request_id(e)})")
                time.sleep(sleep_s)
        raise RuntimeError("Exhausted retries; investigate logs")

Where to next?

Errors index — every error code, every page
Idempotency — safe-retry contract for writes
Rate limits — RPM + TPM buckets, headers, Retry-After
How we count tokens — budget input tokens to avoid embeddings_input_too_large
Cost transparency — receipt math + reconciliation
