Create an image
Submit an image generation. Optionally anchor identity with a character.
POST /v1/images submits an image generation request. Credits are deducted immediately from your team balance; the generation is processed asynchronously. Poll GET /v1/images/{id} for status, or pass webhook_url for a push callback when the generation completes or fails.
For a step-by-step walkthrough, see the Quickstart. The full request shape, including all generation parameters, is in the playground below.
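As a rough illustration of the submit step, the sketch below builds and sends a POST /v1/images request with the Python standard library. The host is taken from the example output URLs on this page; the bearer-token `Authorization` header and the helper names `build_create_request` and `submit` are assumptions made for this example, not confirmed parts of the API.

```python
import json
import urllib.request

# Host taken from the example output URLs on this page.
BASE_URL = "https://api.aurous-labs.com"


def build_create_request(api_key, prompt, **params):
    """Build (but do not send) a POST /v1/images request."""
    body = {"prompt": prompt, **params}
    return urllib.request.Request(
        f"{BASE_URL}/v1/images",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # Assumption: bearer-token auth. Check the Authorizations section.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def submit(request):
    """Send the request and return the decoded generation object."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes the body easy to inspect or log before any credits are deducted.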
Using a LoRA
Pick a LoRA via GET /v1/loras and pass its id (opaque lora_* or slug) as lora_id. Alternatively, omit lora_id and the platform's dispatcher picks a style based on your prompt. If no clear match exists, the image is generated without a style.
Using a character
When character_id is set, the platform attaches the character's saved reference images to the generation as visual anchors for identity consistency. The dispatch path is image-to-image, so denoise_strength becomes effective and influences how closely the output follows the reference images versus the prompt.
The character must be in status: ready. Referencing a synthesizing, reviewing, or failed character, or a soft-deleted one, returns 400 character_not_ready.
character_id and reference_image_urls are mutually exclusive. Sending both returns 400 mutually_exclusive_input. Pick one path per generation.
Size
Specify image dimensions in one of two ways, never both.
Named preset via size:
| Tier | Available aspect ratios |
|---|---|
| 2k | 1:1, 3:2, 2:3, 4:3, 3:4, 16:9, 9:16, 21:9 |
| 4k | 1:1, 3:2, 2:3, 4:3, 3:4, 16:9, 9:16, 21:9 |
Preset names use the <tier>_<ratio> form, e.g. 2k_16_9, 4k_1_1, 2k_2_3.
Custom dimensions via width and height:
- Both required when used.
- Range [1024, 4096] per side.
- Snapped server-side to the nearest multiple of 32; the response width/height reflect the post-snap value.
Sending both size and width/height returns 400 parameter_invalid_combination. Sending only one of width/height returns 400 missing_field.
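The sizing rules above can be checked client-side before a request is sent. The following is a minimal sketch, not the server's implementation: the helper names are hypothetical, and the tie-breaking direction in `snap_to_32` (halfway values round up) is an assumption, since this page only says "nearest multiple of 32".

```python
def snap_to_32(value: int) -> int:
    """Round to the nearest multiple of 32 (ties round up: an assumption)."""
    return (value + 16) // 32 * 32


def size_preset(tier: str, ratio: str) -> str:
    """Turn ("2k", "16:9") into the preset name "2k_16_9"."""
    return f"{tier}_{ratio.replace(':', '_')}"


def size_params(size=None, width=None, height=None) -> dict:
    """Return the sizing fields for the request body, mirroring the 400 rules."""
    if size is not None and (width is not None or height is not None):
        raise ValueError("parameter_invalid_combination: use size OR width/height")
    if size is not None:
        return {"size": size}
    if width is None and height is None:
        return {}  # neither given; the server-side default is not covered here
    if width is None or height is None:
        raise ValueError("missing_field: width and height are both required")
    for name, value in (("width", width), ("height", height)):
        if not 1024 <= value <= 4096:
            raise ValueError(f"{name} must be within [1024, 4096]")
    return {"width": width, "height": height}
```

For example, `size_params(size=size_preset("2k", "16:9"))` yields `{"size": "2k_16_9"}`, while `snap_to_32(1050)` gives a local estimate of the post-snap value the response would report.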
Idempotency
Pass an Idempotency-Key header (any opaque value, 1–256 chars; UUID v4 recommended). Same key + same body within 24h replays the cached response with Aurous-Idempotent-Replayed: true. Same key + different body returns 409 idempotency_key_in_use. The 24h window and 1–256 char bound are documented in Idempotency.
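A small sketch of how a client might produce the header and detect a replay. The function names are hypothetical; `response_headers` is assumed to be any mapping of header names to values, such as the `headers` attribute of a `urllib` response.

```python
import uuid


def idempotency_headers(key=None):
    """Return an Idempotency-Key header, generating a UUID v4 when no key is given."""
    if key is None:
        key = str(uuid.uuid4())
    if not 1 <= len(key) <= 256:
        raise ValueError("Idempotency-Key must be 1-256 characters")
    return {"Idempotency-Key": key}


def was_replayed(response_headers) -> bool:
    """True when the response was served from the idempotency cache."""
    value = response_headers.get("Aurous-Idempotent-Replayed", "")
    return value.lower() == "true"
```

Generate the key once per logical operation and reuse it on every retry of that operation; generating a fresh key per attempt would defeat the replay protection.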
Webhooks
Provide webhook_url to receive a POST callback when the generation reaches a terminal state. The payload is { event: "image.completed" | "image.failed", data: {...} } where data matches the GET /v1/images/{id} response. See Webhooks for signature verification.
Authorizations
Your team API key (starts with al_live_).
Headers
Stripe-style idempotency key (1-256 chars). Same key + same canonical-JSON body returns the cached response with Aurous-Idempotent-Replayed: true. Same key + different body returns 409 invalid_request / idempotency_key_in_use. UUID v4 recommended. Replay window is 24 hours. Absent header is treated as non-idempotent (each call processes anew).
Optional API version pin (YYYY-MM-DD). Defaults to your team's pinned version, or the system default 2026-05-15 for unauthenticated requests.
Pattern: ^\d{4}-\d{2}-\d{2}$
Example: "2026-05-15"
Body
The text prompt describing the image to generate. 1-4000 characters; whitespace-only is rejected.
Length: 1 - 4000
Example: "A golden sunset over mountains, cinematic lighting, 8k resolution"
Negative prompt - elements to exclude from the generated image. 1-4000 characters.
Length: 1 - 4000
Example: "blurry, low quality, watermark, text"
Optional. Opaque LoRA identifier (lora_*) or URL-friendly slug, from GET /v1/loras. UUIDs are also accepted for legacy back-compat. If omitted, the platform will choose a suitable style based on your prompt; if none matches, your prompt is generated without a style.
"lora_01HXMQ7Z3K8Y2ABCDEFGHJKM"
Optional character ID (char_<ulid> from POST /v1/characters; UUID also accepted for legacy back-compat). When set, the character's reference images are sent to the model as visual anchors for identity consistency. The character must be in status: ready — referencing a synthesizing / reviewing / failed character returns 400 character_not_ready. Cross-team character_ids return 404 (existence is never leaked). Mutually exclusive with reference_image_urls: sending both returns 400 mutually_exclusive_input.
"char_01HXMQ7Z3K8Y2ABCDEFGHJKM"
Custom output image width in pixels. Use with height OR use size (preset), not both. Range [1024, 4096]; snapped server-side to the nearest multiple of 32. Sending both size and custom dimensions returns 400 with code parameter_invalid_combination. Sending only one of width/height returns 400 with code missing_field.
Range: 1024 <= x <= 4096
Example: 2048
Custom output image height in pixels. Use with width OR use size (preset), not both. Range [1024, 4096]; snapped server-side to the nearest multiple of 32. Sending both size and custom dimensions returns 400 with code parameter_invalid_combination. Sending only one of width/height returns 400 with code missing_field.
Range: 1024 <= x <= 4096
Example: 2048
Number of diffusion steps (higher = more detail, slower). If omitted, resolves to the default for the selected style, then a platform default of 11. An explicit value always takes precedence.
Range: 1 <= x <= 100
Example: 30
Guidance scale - how closely to follow the prompt (higher = more literal). If omitted, resolves to the default for the selected style, then a platform default of 4.3. An explicit value always takes precedence.
Range: 0.1 <= x <= 30
Example: 7.5
CFG rescale factor — dampens classifier-free guidance to reduce burn/oversaturation at high guidance_scale values. Range 0.0-1.0. If omitted, resolves to the default for the selected style, then a platform default of 0.7. Applies to every image generation regardless of references. An explicit value always takes precedence.
Range: 0 <= x <= 1
Example: 0.7
Denoising strength applied when reference images or a character are attached. Range 0.0-1.0. If omitted on a request that has references, resolves to the style or character consistency default, then a platform default of 0.6. Ignored (and never defaulted) on bare text-to-image requests with no references. An explicit value always takes precedence.
Range: 0 <= x <= 1
Example: 0.6
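The default-resolution order described for the preceding parameters (explicit value, then style default, then platform default) can be modeled as below. This is an illustrative sketch of the documented precedence, not the platform's code; `resolve`, `resolve_denoise`, and their parameter names are hypothetical.

```python
def resolve(explicit, style_default, platform_default):
    """Return the first value that is set: explicit, then style, then platform."""
    for candidate in (explicit, style_default, platform_default):
        if candidate is not None:  # 0 and 0.0 are valid explicit values
            return candidate
    return None


def resolve_denoise(explicit, style_or_character_default, has_references,
                    platform_default=0.6):
    """Denoise is ignored, and never defaulted, when no references are attached."""
    if not has_references:
        return None
    return resolve(explicit, style_or_character_default, platform_default)
```

Comparing against `None` rather than relying on truthiness matters here, because an explicit `0.0` is inside the documented range and must not fall through to a default.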
Random seed for reproducible generations. Omit to let the model pick a random seed — the concrete value it used is returned as seed on the succeeded generation, so you can pass that value back here to reproduce the result.
42
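To reproduce a result, the seed reported on a succeeded generation can be copied back into the original request body. A minimal sketch, assuming the response is already decoded into a dict with a `seed` key; `reproduce_body` is a hypothetical helper name.

```python
def reproduce_body(original_body: dict, generation: dict) -> dict:
    """Copy the request body and pin the seed the model actually used."""
    seed = generation.get("seed")
    if seed is None:
        # The seed is null before success and for failed/cancelled generations.
        raise ValueError("no seed available on this generation")
    body = dict(original_body)  # leave the caller's dict untouched
    body["seed"] = seed
    return body
```

For multi-image batches the reported seed belongs to the first image only, so this approach reproduces just that image.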
Image size as a named preset. Use this OR custom width/height, not both. Format is <tier>_<ratio> where tier is 2k or 4k and ratio matches the supported aspect-ratio set. Sending both size and custom dimensions returns 400 with code parameter_invalid_combination.
Allowed values: 2k_1_1, 2k_3_2, 2k_2_3, 2k_4_3, 2k_3_4, 2k_16_9, 2k_9_16, 2k_21_9, 4k_1_1, 4k_3_2, 4k_2_3, 4k_4_3, 4k_3_4, 4k_16_9, 4k_9_16, 4k_21_9
Example: "2k_1_1"
Number of images to generate in this request
Range: 1 <= x <= 4
Example: 1
When true, an LLM rewrites your prompt before generation using the LoRA's style template. This is the only customer-facing prompt-shaping toggle in the public API. Pricing: enhanced generations cost a configurable multiplier of the base rate.
false
Optional webhook URL. When provided, a POST request will be sent to this URL when the generation completes or fails. The payload contains an event field ("image.completed" or "image.failed") and a data field with the generation details (same shape as GET /v1/images/:id). Delivery is attempted up to 3 times with a 2-second delay between retries.
"https://your-server.com/webhooks/aurouslabs"
Up to 6 reference images. Each entry can be either:
- an opaque file ID file_<ulid> returned by POST /v1/files, or
- an https:// URL pointing at a public host (max 2048 chars).
URLs are fetched server-side through an SSRF-pinned client (rejects private IPs / loopback / link-local / cloud metadata) and materialized as a 24h-TTL file under your team. Pricing matches the reference-image rate (see Pricing). An empty or omitted array is treated as "no references". Mutually exclusive with character_id: sending both returns 400 mutually_exclusive_input.
Maximum array length: 6
[
"file_01HXMQ7Z3K8Y2NABCDEFGHJKMN",
"https://example.com/ref2.jpg"
]
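The reference-image constraints can be pre-checked on the client to avoid a rejected request. The sketch below is a hypothetical helper covering only the rules stated on this page (entry count, entry shape, URL length, and exclusivity with character_id); it does not replicate the server's SSRF checks, and treating an empty list alongside character_id as acceptable is an assumption based on "empty array is treated as no references".

```python
def validate_references(reference_image_urls=None, character_id=None):
    """Return the cleaned reference list or raise ValueError on a rule violation."""
    refs = list(reference_image_urls or [])
    if refs and character_id is not None:
        raise ValueError("mutually_exclusive_input: pick references OR a character")
    if len(refs) > 6:
        raise ValueError("at most 6 reference images are allowed")
    for ref in refs:
        if ref.startswith("file_"):
            continue  # opaque file ID from POST /v1/files
        if ref.startswith("https://") and len(ref) <= 2048:
            continue  # public https URL within the length limit
        raise ValueError(f"unsupported reference entry: {ref[:40]!r}")
    return refs
```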
Response
Generation created and pending processing
Discriminator — always inference. Mirrors OpenAI's object-field convention so SDK clients can branch on the resource type without inspecting the ID prefix. A single canonical value (inference) covers both image and video generations; use media_type to distinguish the rendering kind.
Allowed values: inference
Example: "inference"
Opaque generation ID
"img_01HXMQ7Z3K8Y2ABCDEFGHJKM"
Current generation status. Lifecycle: pending (created, awaiting dispatch) → processing (running) → one of the terminal values succeeded / failed / cancelled. Additional terminal values may be introduced in future API versions and will be announced via the changelog before they appear on the wire.
Allowed values: pending, processing, succeeded, failed, cancelled
Example: "succeeded"
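A polling loop over this lifecycle might look like the sketch below. It assumes a caller-supplied `fetch(generation_id)` function that performs GET /v1/images/{id} and returns the decoded object with a `status` key; the function name, interval, and timeout are illustrative choices, not documented values.

```python
import time

# Anything outside these two states is treated as terminal, so terminal
# values added in future API versions still end the loop.
NON_TERMINAL = {"pending", "processing"}


def wait_for_terminal(fetch, generation_id, interval=2.0, timeout=300.0,
                      sleep=time.sleep):
    """Poll until the generation leaves pending/processing or the timeout hits."""
    deadline = time.monotonic() + timeout
    while True:
        generation = fetch(generation_id)
        if generation["status"] not in NON_TERMINAL:
            return generation
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{generation_id} still {generation['status']}")
        sleep(interval)
```

Checking against the non-terminal set, rather than listing the terminal values, is a deliberate choice given that additional terminal values may be introduced later.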
The text prompt used for generation
"A golden sunset over mountains, cinematic lighting"
Creation timestamp (ISO 8601)
"2026-05-04T10:00:00Z"
Distinguishes image vs video generation. May be null for older rows minted before this column existed.
Allowed values: image, video
Example: "image"
Negative prompt to exclude from generation
"blurry, low quality"
Generated image proxy URLs. Each URL is anonymous-read (no auth header required) and edge-cached for 24 hours. Available for ~24 hours after generation. Save what you want to keep — long-term storage is intentionally not part of the platform. URLs return 410 Gone after expiry.
[
"https://api.aurous-labs.com/v1/images/img_01HXMQ7Z3K8Y2ABCDEFGHJKM/output/0"
]
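Because output URLs stop working after roughly 24 hours, a client will usually copy the files to its own storage once a generation succeeds. The sketch below assumes the decoded generation has `id` and `output_urls` keys; the helper names and the `.png` extension are guesses made for this example, since the output format is not stated here.

```python
import pathlib
import urllib.request


def _http_get(url: str) -> bytes:
    """Fetch a URL anonymously (output URLs need no auth header)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def save_outputs(generation: dict, dest_dir, fetch=_http_get, ext=".png"):
    """Download every output URL into dest_dir and return the written paths."""
    dest = pathlib.Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(generation.get("output_urls") or []):
        path = dest / f"{generation['id']}_{index}{ext}"
        path.write_bytes(fetch(url))
        saved.append(path)
    return saved
```

The `fetch` parameter is injectable so the function can be exercised without network access or swapped for a client with retries.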
Generated video proxy URL (only present on media_type: video). Same 24h TTL as image output_urls.
"https://api.aurous-labs.com/v1/videos/vid_01HXMQ7Z3K8Y2ABCDEFGHJKM/output?token=..."
Reference image URLs that were used as visual anchors for this generation, if any. Snapshotted at inference time — for character-attached generations, these are the resolved character refs at submission, not the live character state.
["https://example.com/ref1.jpg"]
Error message if the generation failed
"Content policy violation"
Processing duration in milliseconds (set on terminal status)
14820
Per-generation cost breakdown — same shape as the estimated_cost returned by POST /v1/{images,videos}/estimate. May be null for older rows from before this field existed; populated for all new generations. The amount reflects the committed charge for terminal-status rows.
{
"amount": 2,
"currency": "credit",
"breakdown": { "base": 1, "enhance": 1 }
}
Resolved output image width in pixels (image generations only). Reflects the post-snap dimension actually generated; may differ from a custom-requested width by up to 31 px due to multiple-of-32 snapping.
2048
Resolved output image height in pixels (image generations only). Reflects the post-snap dimension actually generated; may differ from a custom-requested height by up to 31 px due to multiple-of-32 snapping.
2048
Number of images requested in the batch
1
Named size preset applied to this generation. null when the request used custom width/height instead of a preset.
Allowed values: 2k_1_1, 2k_3_2, 2k_2_3, 2k_4_3, 2k_3_4, 2k_16_9, 2k_9_16, 2k_21_9, 4k_1_1, 4k_3_2, 4k_2_3, 4k_4_3, 4k_3_4, 4k_16_9, 4k_9_16, 4k_21_9
Example: "2k_1_1"
Inference mode dispatched. t2i (text-to-image) for every current image generation — reference images and characters are supplementary inputs to the t2i flow, not a separate mode. i2i is reserved for a future image-edit endpoint.
Allowed values: t2i, i2i
Example: "t2i"
CFG rescale factor the customer supplied on the request body, echoed back here. Range 0.0-1.0. Omitted when the customer did not supply a per-request value (the platform applied a precedence-chain default — LoRA, character override, or the global 0.7 — which is not exposed on the response).
0.7
Denoising strength the customer supplied on the request body, echoed back here. Range 0.0-1.0. Omitted when the customer did not supply a value or when the generation was a bare text-to-image request (denoise is only applied when reference images or a character are attached).
0.6
The random seed the model actually used for this image generation. Populated even when you omit seed on the request — the platform requests a random seed and records the concrete value the provider rolled, so you can reproduce the result by passing it back as seed. Available once status is succeeded; null before then and for failed/cancelled generations. For multi-image batches (image_count > 1) this is the seed of the first image (output_urls[0]); per-image seeds are not yet exposed. Image generations only.
819572108
Video duration in seconds (video generations only)
5
Video resolution (video generations only)
Allowed values: 480p, 720p, 1080p
Example: "480p"
Video aspect ratio (video generations only)
Allowed values: 16:9, 4:3, 1:1, 3:4, 9:16, 21:9
Example: "16:9"
Whether the camera was held fixed during the video
Character ID supplied on the request (char_<ulid> or legacy UUID), echoed back. null when no character was attached to this generation.
"char_01HXMQ7Z3K8Y2ABCDEFGHJKM"
LoRAs applied to this generation. null for prompt-only and pure-reference generations.
API contract version applied at the time this row was minted (D25 — frozen for replay across future version bumps).
"2026-05-15"
Aurous-Request-Id of the POST that created this row. Quote in support tickets to trace the original create request.
"req_01HXMQ7Z3K8Y2ABCDEFGHJKM"
Terminal-status timestamp (ISO 8601). null until the generation reaches a terminal state.
"2026-05-04T10:00:14Z"

