POST /v1/embeddings enforces a small set of input caps to keep latency predictable and prevent abuse. Hitting a cap returns a 400 invalid_request, in most cases with a specific error code (the two DTO-level caps return the generic validation error instead); this page enumerates each cap, its trigger, and the error you’ll see.
Caps at a glance
| Limit | Cap | Error code |
|---|---|---|
| Total content parts per request | 16 | embeddings_input_too_many_items |
| Image parts per request | 8 | embeddings_input_too_many_items |
| Video parts per request | 0 (rejected) | embeddings_video_unsupported |
| Text characters per text part | 1,000,000 (1M) | DTO invalid_request (max-length validator) |
| URL length (image_url) | 2,048 chars | DTO invalid_request (max-length validator) |
| Aggregate input tokens (text + visual + video) | 128K | embeddings_input_too_large |
| string[] batch input | Not accepted | embeddings_batch_not_supported |
Total content parts (≤ 16)
input accepts either a string (input: "...") or an array of content parts (input: [{type:"text", text:"..."}, {type:"image_url", image_url:{url:"..."}}, ...]). The array form is capped at 16 total parts across all types.
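As a rough illustration of the two accepted shapes, the sketch below builds one body of each kind and counts the parts. The example URL and the helper name `count_parts` are hypothetical, any other request fields the endpoint may require are omitted, and treating a plain string as one part is an assumption of this sketch rather than documented behaviour.

```python
# Form 1: a single string.
string_body = {"input": "a photo of a red bicycle"}

# Form 2: an array of content parts (text and image_url can be mixed).
parts_body = {
    "input": [
        {"type": "text", "text": "a photo of a red bicycle"},
        # Hypothetical URL, for illustration only.
        {"type": "image_url", "image_url": {"url": "https://cdn.example.com/bike.jpg"}},
    ]
}

MAX_TOTAL_PARTS = 16  # total-parts cap from this page


def count_parts(body: dict) -> int:
    """Return how many content parts a request body carries.

    A plain string is counted as one part (assumption made for this sketch).
    """
    value = body["input"]
    return 1 if isinstance(value, str) else len(value)


def within_total_cap(body: dict) -> bool:
    """True when the body stays at or under the 16-part cap."""
    return count_parts(body) <= MAX_TOTAL_PARTS
```

A body whose array holds 17 parts would fail `within_total_cap` locally, which mirrors the case this page maps to embeddings_input_too_many_items.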
Image parts (≤ 8)
Within the 16-part total, image parts are capped at 8. Each image contributes ~1,000-1,500 visual tokens to the input, and packing more than 8 images into one request risks exceeding the 128K context window mid-request, which would surface as a noisy embeddings_input_too_large rather than a clean image-cap error.
Video parts (rejected, 2026-05-24)
video_url parts are no longer accepted on the v1 embeddings surface. Submitting one returns embeddings_video_unsupported. The provider folds video frames into the visual billing bucket — the previously published video rate never actually fired — so we removed the input shape entirely. Extract a representative frame in your pipeline and submit it as image_url; it bills at the visual rate.
Per-part text length (≤ 1,000,000 chars)
Each text part is capped at 1,000,000 characters. This is a guardrail against runaway input: in practice the 128K-token aggregate kicks in first for English text (~4 chars/token on average, so ~512K chars at most), but the per-part cap bounds a single malformed part. Exceeding 1,000,000 characters returns a DTO-level 400 invalid_request from the validation layer (not a specific embeddings_* code), and the request never reaches the embedding service.
URL length (≤ 2,048 chars)
image_url.url and video_url.url are capped at 2,048 characters. URLs longer than that are rejected at the DTO layer with 400 invalid_request. The error message currently surfaces the generic content-parts validator hint rather than naming the URL-length cap specifically; a more pointed error message is tracked for v1.0.x.
This is a sane guard against signed URLs with arbitrarily-long query strings; production CDN URLs are typically <1,000 chars and rarely come close.
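The per-request caps described so far (part counts, video rejection, text length, URL length) can be checked on the client before a request is sent. The following is a minimal sketch, not the server's validator: the function name `check_parts` is hypothetical, it collects every code that applies whereas the server presumably stops at the first failure, and it measures length with Python's `len()` (Unicode code points), which may not match how the server counts characters.

```python
MAX_TOTAL_PARTS = 16
MAX_IMAGE_PARTS = 8
MAX_TEXT_CHARS = 1_000_000
MAX_URL_CHARS = 2_048


def check_parts(parts: list) -> list:
    """Return the error codes this page associates with the given parts."""
    errors = []

    def add(code: str) -> None:
        # Report each code once, in the order it was first seen.
        if code not in errors:
            errors.append(code)

    image_count = sum(1 for p in parts if p.get("type") == "image_url")
    if len(parts) > MAX_TOTAL_PARTS or image_count > MAX_IMAGE_PARTS:
        add("embeddings_input_too_many_items")

    for part in parts:
        kind = part.get("type")
        if kind == "video_url":
            add("embeddings_video_unsupported")
        elif kind == "text":
            if len(part.get("text", "")) > MAX_TEXT_CHARS:
                add("invalid_request")  # generic DTO-level error
        elif kind == "image_url":
            url = part.get("image_url", {}).get("url", "")
            if len(url) > MAX_URL_CHARS:
                add("invalid_request")  # generic DTO-level error
    return errors
```

An empty list from `check_parts` only means none of these caps were hit locally; the aggregate token limit and server-side URL fetching rules still apply.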
Aggregate input tokens (≤ 128K)
The hard limit is the model’s context window. Sum the tokens across all parts:
- Text: count via the BytePlus tokenizer (see how-we-count-tokens)
- Image: ~1,000-1,500 visual tokens per typical image
- Video: ~100-200 visual tokens per second of video
To check a request before it can fail with embeddings_input_too_large, call POST /v1/embeddings/estimate: it takes the same body and returns the token breakdown plus the credit charge.
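The estimate endpoint is the authoritative check. For a quick offline sanity check, the per-modality figures quoted on this page can be folded into a rough client-side estimate. This is only a heuristic sketch: it assumes "128K" means 128,000 tokens, uses the ~4 chars/token English average instead of the real tokenizer, and charges every image at the upper end of the ~1,000-1,500 range. The function names are hypothetical.

```python
CONTEXT_TOKENS = 128_000  # "128K" read as 128,000 (assumption)
CHARS_PER_TOKEN = 4       # English-text average quoted on this page
TOKENS_PER_IMAGE = 1_500  # upper end of the ~1,000-1,500 range


def rough_token_estimate(parts: list) -> int:
    """Approximate the aggregate input tokens for a list of content parts."""
    total = 0
    for part in parts:
        kind = part.get("type")
        if kind == "text":
            # Ceiling division, so any non-empty text counts as at least 1 token.
            total += -(-len(part.get("text", "")) // CHARS_PER_TOKEN)
        elif kind == "image_url":
            total += TOKENS_PER_IMAGE
    return total


def likely_too_large(parts: list) -> bool:
    """True when the rough estimate exceeds the assumed 128K budget."""
    return rough_token_estimate(parts) > CONTEXT_TOKENS
```

Because the text figure is an average for English, non-English or code-heavy input can tokenize very differently, so a request near the limit is better checked against the estimate endpoint.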
string[] batch input
OpenAI accepts input: ["a", "b", "c"] as a batched N-vector return. Aurous does NOT. See OpenAI batch incompat for the rationale and workaround.
Server-side URL fetching constraints
image_url.url and video_url.url are fetched server-side at request time. Several schemes / hosts are blocked:
- HTTPS-only: http://, ftp://, data:, file://, gopher:// are all rejected
- RFC1918 addresses (10.x, 172.16-31.x, 192.168.x) blocked
- Loopback (127.0.0.1, ::1) blocked
- Link-local (169.254.x, fe80::/10) blocked
- Cloud metadata (169.254.169.254, metadata.google.internal) blocked
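The blocklist above can be approximated on the client to catch obviously unfetchable URLs early. The sketch below uses only the Python standard library; the function name `url_block_reason` and its return labels are hypothetical. It inspects literal IP addresses and the one metadata hostname listed here, does not resolve DNS names, and makes no claim about how the server implements its own checks.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit

RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]
METADATA_IP = ipaddress.ip_address("169.254.169.254")
METADATA_HOSTNAMES = {"metadata.google.internal"}


def url_block_reason(url: str) -> Optional[str]:
    """Return a short label if the URL matches a blocked category, else None."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return "host"  # e.g. a malformed IPv6 literal
    if parts.scheme.lower() != "https":
        return "scheme"
    host = (parts.hostname or "").lower()
    if not host:
        return "host"
    if host in METADATA_HOSTNAMES:
        return "metadata"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return None  # a DNS name; where it resolves is not checked here
    if ip == METADATA_IP:
        return "metadata"
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:
        return "link-local"
    if any(ip in net for net in RFC1918_NETWORKS):
        return "private"
    return None
```

A `None` result only means the URL passed these literal checks; a hostname that resolves to a blocked address would still be rejected by the server.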
Where to next?
- Multimodal embeddings — the full content-parts surface
- URL fetching — what happens server-side when we fetch an image/video URL
- POST /v1/embeddings/estimate — preview cost + token counts
- How we count tokens — per-modality tokenization details
- Error codes — full taxonomy

