Embedding limits

POST /v1/embeddings enforces a small set of input caps to keep latency predictable and prevent abuse. Hitting a cap returns a 400 invalid_request, with a specific error code for most caps and a generic validation error for the two length caps; this page enumerates each cap, its trigger, and the error you'll see.

Caps at a glance

Limit	Cap	Error code
Total content parts per request	16	`embeddings_input_too_many_items`
Image parts per request	8	`embeddings_input_too_many_items`
Video parts per request	0 (rejected)	`embeddings_video_unsupported`
Text characters per text part	1,000,000 (1M)	DTO `invalid_request` (max-length validator)
URL length (image_url)	2,048 chars	DTO `invalid_request` (max-length validator)
Aggregate input tokens (text + visual)	128K	`embeddings_input_too_large`
`string[]` batch input	Not accepted	`embeddings_batch_not_supported`

The model context window is 128K input tokens. The character caps above are guardrails on individual parts; the aggregate token count is the real constraint.
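As a rough illustration of how the structural caps in the table combine, here is a minimal client-side pre-flight check in Python. It is a sketch, not the server's validator: the function name `check_embedding_input` is hypothetical, the `video_url` type string is assumed from the part name used on this page, and applying the 1,000,000-character cap to the plain-string form is an assumption. It does not count tokens, so a request that passes can still fail with `embeddings_input_too_large`.

```python
from typing import Any

MAX_PARTS = 16              # total content parts per request
MAX_IMAGE_PARTS = 8         # image_url parts per request
MAX_TEXT_CHARS = 1_000_000  # characters per text part
MAX_URL_CHARS = 2_048       # characters per image_url.url


def check_embedding_input(value: Any) -> list[str]:
    """Return a list of problems; an empty list means no structural cap was hit."""
    if isinstance(value, str):
        # Assumption: the per-part character cap also applies to the string form.
        if len(value) > MAX_TEXT_CHARS:
            return [f"text has {len(value)} chars; cap is {MAX_TEXT_CHARS}"]
        return []
    if not isinstance(value, list):
        return ["input must be a string or a list of content parts"]
    if any(isinstance(item, str) for item in value):
        return ["string[] batch input (embeddings_batch_not_supported)"]

    problems: list[str] = []
    if len(value) > MAX_PARTS:
        problems.append(
            f"{len(value)} parts; cap is {MAX_PARTS} (embeddings_input_too_many_items)"
        )
    image_count = 0
    for index, part in enumerate(value):
        kind = part.get("type") if isinstance(part, dict) else None
        if kind == "text":
            chars = len(part.get("text", ""))
            if chars > MAX_TEXT_CHARS:
                problems.append(f"part {index}: {chars} chars; cap is {MAX_TEXT_CHARS}")
        elif kind == "image_url":
            image_count += 1
            url_chars = len(part.get("image_url", {}).get("url", ""))
            if url_chars > MAX_URL_CHARS:
                problems.append(f"part {index}: URL is {url_chars} chars; cap is {MAX_URL_CHARS}")
        elif kind == "video_url":
            problems.append(f"part {index}: video part (embeddings_video_unsupported)")
        else:
            problems.append(f"part {index}: unrecognized part type {kind!r}")
    if image_count > MAX_IMAGE_PARTS:
        problems.append(
            f"{image_count} image parts; cap is {MAX_IMAGE_PARTS} (embeddings_input_too_many_items)"
        )
    return problems
```

Running this before sending a request catches the cheap failures locally; the API's own response remains the source of truth.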

Total content parts (≤ 16)

input accepts either a string (input: "...") or an array of content parts (input: [{type:"text", text:"..."}, {type:"image_url", image_url:{url:"..."}}, ...]). The array form is capped at 16 total parts across all types.

# 16 text parts — accepted
curl -X POST https://api.aurous-labs.com/v1/embeddings \
  -H "X-Api-Key: $AUROUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "aurous-embed-vision-1.0", "input": [
    {"type": "text", "text": "part 1"},
    {"type": "text", "text": "part 2"},
    ... (14 more) ...
  ] }'

# 17+ parts — rejected
{
  "error": {
    "type": "invalid_request",
    "code": "embeddings_input_too_many_items",
    "message": "Embedding request has 17 content parts; the cap is 16 total parts per request. See https://docs.aurous-labs.com/api-reference/embeddings/limits.",
    "doc_url": "https://docs.aurous-labs.com/errors#embeddings_input_too_many_items"
  }
}

Image parts (≤ 8)

Within the 16-part total, image parts are capped at 8. The cap exists because each image contributes ~1,000-1,500 visual tokens to the input — packing more than 8 images in one request risks blowing the 128K context window mid-request, which surfaces as a noisy embeddings_input_too_large rather than a clean image-cap error.

# 8 images — accepted
{
  "model": "aurous-embed-vision-1.0",
  "input": [
    {"type": "image_url", "image_url": {"url": "https://example.com/1.jpg"}},
    ... (7 more) ...
  ]
}

# 9 images — rejected with embeddings_input_too_many_items

If you need to embed 16 images, send two requests of 8 each (loop pattern). See OpenAI batch incompat for the loop guidance.
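The loop pattern can be sketched in Python as follows. The helper name `image_batches` is hypothetical; the sketch only builds request bodies of at most 8 image parts each and leaves the HTTP call, and the interpretation of each response, to your client code.

```python
from typing import Iterator

MAX_IMAGE_PARTS = 8  # image cap per request, from this page


def image_batches(urls: list[str], size: int = MAX_IMAGE_PARTS) -> Iterator[list[dict]]:
    """Yield lists of image_url parts, each small enough for one request."""
    for start in range(0, len(urls), size):
        yield [
            {"type": "image_url", "image_url": {"url": url}}
            for url in urls[start:start + size]
        ]


# 16 placeholder URLs become two request bodies of 8 image parts each.
urls = [f"https://example.com/{n}.jpg" for n in range(1, 17)]
bodies = [
    {"model": "aurous-embed-vision-1.0", "input": parts}
    for parts in image_batches(urls)
]
print([len(body["input"]) for body in bodies])  # [8, 8]
```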

Video parts (rejected, 2026-05-24)

video_url parts are no longer accepted on the v1 embeddings surface. Submitting one returns embeddings_video_unsupported. The provider folds video frames into the visual billing bucket — the previously published video rate never actually fired — so we removed the input shape entirely. Extract a representative frame in your pipeline and submit it as image_url; it bills at the visual rate.

# Any video_url part is rejected
{
  "error": {
    "code": "embeddings_video_unsupported",
    "message": "Video input is not supported on v1 embeddings. Submit text or image_url parts; the visual rate applies to image inputs."
  }
}

Per-part text length (≤ 1,000,000 chars)

Each text part is capped at 1,000,000 characters. This is a guardrail against runaway input; in practice the 128K-token aggregate kicks in first for English text (~4 chars/token average → ~512K chars max), but the per-part cap exists to bound a single malformed part. Exceeding 1,000,000 characters returns a DTO-level 400 invalid_request from the validation layer (not a specific embeddings_* code) — the request never reaches the embedding service.

URL length (≤ 2,048 chars)

image_url.url is capped at 2,048 characters. Longer URLs are rejected at the DTO layer with 400 invalid_request. The error message currently surfaces the generic content-parts validator hint rather than naming the URL-length cap specifically; a more pointed error message is tracked for v1.0.x. The cap guards against signed URLs with arbitrarily long query strings; production CDN URLs are typically under 1,000 characters and rarely come close.

Aggregate input tokens (≤ 128K)

The hard limit is the model’s context window. Sum the tokens across all parts:

Text: count via the BytePlus tokenizer (see how-we-count-tokens)
Image: ~1,000-1,500 visual tokens per typical image
Video: not applicable; video_url parts are rejected (see Video parts above)

If the sum exceeds 128K, the call returns embeddings_input_too_large:

{
  "error": {
    "type": "invalid_request",
    "code": "embeddings_input_too_large",
    "message": "Embedding input is approximately 147,392 tokens, exceeding the 128,000-token context window of aurous-embed-vision-1.0. Trim text parts, reduce image count, or use a shorter video clip. See https://docs.aurous-labs.com/api-reference/embeddings/limits.",
    "doc_url": "https://docs.aurous-labs.com/errors#embeddings_input_too_large"
  }
}

To preview the count without billing, use POST /v1/embeddings/estimate. It takes the same body and returns the token breakdown and the credit charge.
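For a quick local sanity check before calling the estimate endpoint, the figures quoted on this page (about 4 characters per token for English text, and up to about 1,500 visual tokens per typical image) can be combined into a pessimistic back-of-the-envelope estimate. This is a heuristic sketch with hypothetical function names, not the tokenizer the service uses; the estimate endpoint remains the authoritative count.

```python
CONTEXT_WINDOW_TOKENS = 128_000  # context window of aurous-embed-vision-1.0
CHARS_PER_TOKEN = 4              # rough English average quoted on this page
TOKENS_PER_IMAGE_HIGH = 1_500    # upper end of the typical per-image range


def rough_token_estimate(text_chars: int, image_count: int) -> int:
    """Pessimistic estimate: ceil(chars / 4) plus 1,500 tokens per image."""
    text_tokens = -(-text_chars // CHARS_PER_TOKEN)  # ceiling division
    return text_tokens + image_count * TOKENS_PER_IMAGE_HIGH


def likely_too_large(text_chars: int, image_count: int) -> bool:
    """True when the rough estimate exceeds the 128,000-token window."""
    return rough_token_estimate(text_chars, image_count) > CONTEXT_WINDOW_TOKENS


print(rough_token_estimate(512_000, 0))  # 128000
print(rough_token_estimate(400_000, 8))  # 112000
print(likely_too_large(500_000, 8))      # True (125000 + 12000 = 137000)
```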

`string[]` batch input

OpenAI accepts input: ["a", "b", "c"] and returns one vector per string. Aurous does not; the request fails with embeddings_batch_not_supported. See OpenAI batch incompat for the rationale and workaround.
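One way to adapt client code that used to send a `string[]` batch is to issue one request per string. The sketch below assumes that approach; `bodies_for_texts` and `embed_each` are hypothetical helpers, and `post` stands in for whatever function in your client sends a JSON body to POST /v1/embeddings and returns the parsed response. The response shape is not assumed here.

```python
from typing import Callable

MODEL = "aurous-embed-vision-1.0"


def bodies_for_texts(texts: list[str], model: str = MODEL) -> list[dict]:
    """Build one request body per string instead of a single string[] batch."""
    return [{"model": model, "input": text} for text in texts]


def embed_each(texts: list[str], post: Callable[[dict], dict]) -> list[dict]:
    """Send the bodies one at a time via `post`; responses keep the input order."""
    return [post(body) for body in bodies_for_texts(texts)]


# Stand-in for a real HTTP call, used only to show the call pattern.
def fake_post(body: dict) -> dict:
    return {"received_input": body["input"]}


responses = embed_each(["a", "b", "c"], fake_post)
print(len(responses))  # 3
```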

Server-side URL fetching constraints

image_url.url is fetched server-side at request time. Several schemes and hosts are blocked:

HTTPS-only — http://, ftp://, data:, file://, gopher:// all rejected
RFC1918 addresses (10.x, 172.16-31.x, 192.168.x) blocked
Loopback (127.0.0.1, ::1) blocked
Link-local (169.254.x, fe80::/10) blocked
Cloud metadata (169.254.169.254, metadata.google.internal) blocked

See URL fetching for the full set of guards and what happens when a URL 404s or times out (10-second fetch timeout).
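The scheme and address rules listed above can be approximated client-side with Python's standard library. This is a sketch under stated assumptions: `url_precheck` is a hypothetical helper, it only inspects literal IP addresses and the one metadata hostname named on this page, and it does not resolve DNS names, so a hostname that resolves to a blocked address would still pass here and be refused by the server.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit

BLOCKED_HOSTNAMES = {"metadata.google.internal"}


def url_precheck(url: str) -> Optional[str]:
    """Return a reason the URL would likely be refused, or None if nothing obvious is wrong."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return f"scheme {parts.scheme!r} is not https"
    host = (parts.hostname or "").lower()
    if not host:
        return "URL has no host"
    if host in BLOCKED_HOSTNAMES:
        return "cloud metadata host"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return None  # a DNS name; what it resolves to is not checked here
    # Order matters: is_private is also true for loopback and link-local ranges.
    if addr.is_loopback:
        return "loopback address"
    if addr.is_link_local:
        return "link-local address"
    if addr.is_private:
        return "private address"
    return None


print(url_precheck("https://example.com/1.jpg"))   # None
print(url_precheck("https://169.254.169.254/x"))   # link-local address
```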

Where to next?

Multimodal embeddings — the full content-parts surface
URL fetching — what happens server-side when we fetch an image/video URL
POST /v1/embeddings/estimate — preview cost + token counts
How we count tokens — per-modality tokenization details
Error codes — full taxonomy
