Aurous Labs is a single API for three modalities (chat completions, embeddings, and image generation), all OpenAI-shaped, all backed by per-token receipts and idempotent writes. Most developers point an existing OpenAI SDK at us and ship after changing two lines: the base URL and the API key. This page walks through one curl + one Node snippet + one Python snippet for each modality. Every example was run against https://api.preprod.aurous-labs.com and the output is shown verbatim.

1. Authenticate

Create a team in the dashboard and mint a key under Settings → API keys. The plaintext is shown only once, at creation, so store it in a secret manager. Keys look like al_live_<64-hex>. Send the key either in the X-Api-Key header or as Authorization: Bearer al_live_... (OpenAI SDKs do this automatically). A 2-second proof-of-life:
curl https://api.aurous-labs.com/v1/balance \
  -H "X-Api-Key: $AUROUS_API_KEY"
{
  "credits": 1023.49,
  "held_credits": 4,
  "available_credits": 1019.49,
  "currency": "credit"
}
If you see your balance, you’re in. If you see 401 invalid_api_key, double-check the header name and the value.
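If you hit 401 invalid_api_key, a quick local check of the key shape and header can rule out copy-paste mistakes before you open a ticket. The Python sketch below is only an illustration: the helper names looks_like_aurous_key and auth_headers are hypothetical, and accepting both lowercase and uppercase hex digits is an assumption, not something the key format above guarantees.

```python
import re

# Key shape taken from the text above: "al_live_" followed by 64 hex characters.
# Assumption: either letter case is accepted for the hex digits.
KEY_PATTERN = re.compile(r"al_live_[0-9a-fA-F]{64}")


def looks_like_aurous_key(key: str) -> bool:
    """Return True when the key matches the al_live_<64-hex> shape."""
    return KEY_PATTERN.fullmatch(key) is not None


def auth_headers(key: str, bearer: bool = False) -> dict:
    """Build one of the two header styles described above.

    bearer=False -> X-Api-Key: <key>
    bearer=True  -> Authorization: Bearer <key>
    """
    if not looks_like_aurous_key(key):
        raise ValueError("key does not look like al_live_<64-hex>")
    if bearer:
        return {"Authorization": f"Bearer {key}"}
    return {"X-Api-Key": key}
```

Pass the returned dict to whatever HTTP client you already use. A key that fails the shape check was most likely truncated or wrapped in quotes when it was copied.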

2. Chat completions

The chat surface is intentionally OpenAI-compatible. Most developers re-use the official OpenAI SDK with two lines changed — the baseURL and the apiKey.
curl -X POST https://api.aurous-labs.com/v1/chat/completions \
  -H "Authorization: Bearer $AUROUS_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: $(uuidgen)" \
  -d '{
    "model": "aurous-grow-2.0-pro",
    "messages": [
      { "role": "user", "content": "In 1 short sentence: what is an embedding?" }
    ],
    "max_tokens": 64
  }'
The response carries the assistant’s message under choices[0].message.content, plus a usage block that includes the token counts and the credit charge. Every chat completion mints a unique cmp_<ulid> id you can fetch later with GET /v1/chat/completions/{id}.
  • Streaming: pass stream: true to receive SSE frames as they arrive. See Chat streaming.
  • Tool calling: pass tools + tool_choice for function calling. See Chat tools.
  • Structured output: pass response_format: { type: "json_schema", json_schema: {...} } for schema-enforced JSON output.
  • Multimodal: include image_url content parts in messages. See Chat multimodal.
  • Reasoning effort: set reasoning_effort: "high" for reasoning-capable models. See Chat reasoning.
For an OpenAI-SDK-only walkthrough of every chat feature, see Drop in for OpenAI.
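If you would rather not pull in an SDK, the same request can be sketched with Python's standard library. This mirrors the curl call above and is not an official client: build_chat_request and send_chat are hypothetical helper names, and the 60-second timeout is an arbitrary choice. The one design point worth noting is that the Idempotency-Key is generated once and can be passed back in on a retry, so a repeated call is treated as the same logical write.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://api.aurous-labs.com/v1"


def build_chat_request(api_key, prompt, idempotency_key=None):
    """Return (url, headers, body) mirroring the curl example above."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Pass the same key again when retrying the same logical request.
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
    }
    body = {
        "model": "aurous-grow-2.0-pro",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return f"{BASE_URL}/chat/completions", headers, body


def send_chat(api_key, prompt, idempotency_key=None):
    """POST the request and return choices[0].message.content."""
    url, headers, body = build_chat_request(api_key, prompt, idempotency_key)
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    # Timeout of 60 seconds is an illustrative value, not a documented limit.
    with urllib.request.urlopen(request, timeout=60) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

Calling send_chat(os.environ["AUROUS_API_KEY"], "In 1 short sentence: what is an embedding?") should return the assistant text as a plain string, assuming the key is valid and the team has credits.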

3. Embeddings

POST /v1/embeddings turns text, images, and video into a single high-dimensional vector you can store, index, and compare. The same OpenAI SDK pattern works — change two lines and call client.embeddings.create.
curl -X POST https://api.aurous-labs.com/v1/embeddings \
  -H "Authorization: Bearer $AUROUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "aurous-embed-vision-1.0",
    "input": "The quick brown fox jumps over the lazy dog."
  }'
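Once you have vectors back, "compare" usually means cosine similarity. The sketch below shows the arithmetic in plain Python. It assumes the embeddings response follows the OpenAI shape, with the vector at data[0].embedding; that path is an assumption based on the "OpenAI-shaped" claim above, and extract_vector and cosine_similarity are hypothetical helper names.

```python
import math


def extract_vector(response: dict) -> list:
    """Pull the vector out of a parsed embeddings response.

    Assumption: the response is OpenAI-shaped, i.e. data[0].embedding.
    """
    return response["data"][0]["embedding"]


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

For more than a few thousand vectors you would normally hand this work to a vector index rather than loop in Python, but the scoring rule is the same.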
The differentiator is multimodal in one call: pass a content-parts array with text + image + video parts, and you get back a single combined vector. See Multimodal embeddings for the full surface — it’s what makes Aurous useful for cross-modal search and recommendation.
cURL (text + image)
curl -X POST https://api.aurous-labs.com/v1/embeddings \
  -H "Authorization: Bearer $AUROUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "aurous-embed-vision-1.0",
    "input": [
      { "type": "text", "text": "Photo of a golden retriever in a park" },
      { "type": "image_url", "image_url": { "url": "https://example.com/dog.jpg" } }
    ]
  }'
The OpenAI SDK does not accept the content-parts array natively (in OpenAI’s TypeScript types, an array input is a string array). To send multimodal input via the OpenAI SDK, use the lower-level client.post() escape hatch; see Multimodal embeddings. For a deeper note on why input: ["a","b","c"] is not supported, see OpenAI batch incompat.
Want to know the cost without billing? POST /v1/embeddings/estimate returns the same response shape minus the vector, no credits charged.
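Whichever client you use, the multimodal request body is just a dict. The sketch below builds the text + image body from the curl example in Python so it can be sent with any HTTP client. The helper name multimodal_embedding_body is hypothetical, and video parts are left out on purpose because their shape is not shown on this page.

```python
import json


def multimodal_embedding_body(text: str, image_url: str,
                              model: str = "aurous-embed-vision-1.0") -> dict:
    """Build the JSON body for a text + image embedding request."""
    return {
        "model": model,
        "input": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


body = multimodal_embedding_body(
    "Photo of a golden retriever in a park",
    "https://example.com/dog.jpg",
)
# Serializes to the same JSON as the -d payload in the curl example.
print(json.dumps(body, indent=2))
```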

4. Image generation

POST /v1/images is the image-generation surface — pick a LoRA from GET /v1/loras (or omit lora_id and let our dispatcher pick a style based on your prompt), wait ~10-30 seconds, fetch the bytes.
# 4a — estimate cost without committing
curl -X POST https://api.aurous-labs.com/v1/images/estimate \
  -H "X-Api-Key: $AUROUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "prompt": "A confident woman in lace lingerie, soft studio lighting, cinematic", "size": "2k_1_1" }'

# 4b — actually generate
curl -X POST https://api.aurous-labs.com/v1/images \
  -H "X-Api-Key: $AUROUS_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: $(uuidgen)" \
  -d '{
    "prompt": "A confident woman in lace lingerie, soft studio lighting, cinematic",
    "size": "2k_1_1"
  }'
Image generation is asynchronous. The initial response carries the row in pending (or processing) state. Poll GET /v1/images/{id} until status is one of succeeded, failed, cancelled, expired, or moderation_rejected — most generations finish in 10-30 seconds.
# Set IMAGE_ID to the img_<ulid> id returned by the POST /v1/images call above.
curl "https://api.aurous-labs.com/v1/images/$IMAGE_ID" \
  -H "X-Api-Key: $AUROUS_API_KEY"
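The polling loop described above can be sketched in Python as follows. The terminal statuses come straight from this page; everything else is an assumption for illustration: the helper name poll_image, the 2-second interval, and the 120-second timeout. The fetch argument is any callable you supply that performs GET /v1/images/{id} and returns the parsed JSON row, which keeps the loop independent of your HTTP client.

```python
import time

# Terminal statuses as listed in the text above.
TERMINAL_STATUSES = {
    "succeeded", "failed", "cancelled", "expired", "moderation_rejected",
}


def poll_image(fetch, image_id, interval=2.0, timeout=120.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch(image_id) until the row reaches a terminal status.

    fetch: caller-supplied callable returning the parsed JSON row.
    interval / timeout: illustrative defaults, not documented limits.
    sleep / clock: injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout
    while True:
        row = fetch(image_id)
        if row.get("status") in TERMINAL_STATUSES:
            return row
        if clock() >= deadline:
            raise TimeoutError(f"{image_id} not finished after {timeout}s")
        sleep(interval)
```

On return, check row["status"] before reading row["output_urls"][0], since every terminal status other than succeeded means there is nothing to download.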
For production, skip polling and register a webhook: we POST a signed payload the moment the row reaches a terminal status. See Webhooks.
output_urls[0] is an anonymous-read proxy URL on api.aurous-labs.com. Fetch it directly; no auth header is needed. Image output URLs expire ~7 days after generation (410 Gone with code: output_expired after that); video output URLs expire ~24 hours after generation. Save what you want to keep.

Response headers worth knowing

Every response carries:
  • Aurous-Request-Id: req_<26-char ULID> — quote this in any support ticket so we can find your request.
  • Aurous-Version: YYYY-MM-DD — the API version applied to this response. Pin a specific version with the Aurous-Version request header to insulate yourself from future changes.
  • X-RateLimit-Limit / X-RateLimit-Remaining / X-RateLimit-Reset — see Rate limits.
  • On POST /v1/chat/completions and POST /v1/embeddings: X-RateLimit-TPM-Limit / X-RateLimit-TPM-Remaining / X-RateLimit-TPM-Reset — a separate tokens-per-minute bucket on top of the standard RPM bucket.
  • On idempotent writes (chat, embeddings, images, videos): Aurous-Idempotent-Replayed: true when the response is a replay of a prior successful call with the same Idempotency-Key.
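To make the headers above easier to log, here is a small Python sketch that collects them into one dict. It assumes the Remaining values are integer strings and that a replay is signalled by the literal value true; the reset headers are not parsed because their format is not described on this page. summarize_headers is a hypothetical helper name.

```python
def summarize_headers(headers: dict) -> dict:
    """Collect the Aurous response headers listed above into one dict."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lower = {name.lower(): value for name, value in headers.items()}

    def as_int(name):
        # Assumption: the Remaining headers carry integer strings.
        value = lower.get(name)
        return int(value) if value is not None else None

    return {
        "request_id": lower.get("aurous-request-id"),
        "api_version": lower.get("aurous-version"),
        "rpm_remaining": as_int("x-ratelimit-remaining"),
        # Only present on chat completions and embeddings.
        "tpm_remaining": as_int("x-ratelimit-tpm-remaining"),
        "replayed": lower.get("aurous-idempotent-replayed", "").lower() == "true",
    }
```

Logging request_id next to every failure gives you the value to quote in a support ticket, and replayed tells you whether a retry reused an earlier result rather than performing the write again.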

Where to next?

  • Authentication — key formats, scopes, rotation, the 24h grace window
  • Errors — every type, every code, what to do
  • Idempotency — safe retries on POST /v1/chat/completions, /v1/embeddings, /v1/images, /v1/videos
  • Rate limits — per-team RPM + per-team TPM buckets, headers, what 429 means
  • Webhooks — signed deliveries, retries, secret rotation
  • API reference — every endpoint, every parameter