POST /v1/chat/completions accepts a response_format parameter that forces the model’s output to be valid JSON. Two modes are supported:
{ "type": "json_object" }— loose. Just guarantees the output parses as JSON. No schema enforcement.{ "type": "json_schema", "json_schema": {...} }— strict. The output must conform to the supplied JSON Schema, validated by the model itself.
json_object — loose JSON output
The simplest mode. Useful when the schema is implicit (the prompt tells the model what to produce) and you just need parseable JSON back.
json_object mode when the schema is loose, dynamic, or you want the model to choose its own keys.
json_schema — strict schema-enforced output
The model is constrained at decoding time to emit JSON that validates against your supplied schema. Useful when downstream code expects an exact shape — extracting fields from a document, classification, structured agent calls.
name field is required (it’s the schema’s identifier in your code; the platform doesn’t use it but the OpenAI-compat surface demands it). strict: true enables the model’s strict-mode decoder; strict: false (or omitting strict) reverts to best-effort schema adherence — the model will TRY to follow the schema but isn’t guaranteed.
Schema constraints
The supplied schema must obey a subset of JSON Schema:type:object,array,string,integer,number,boolean,null(any combination)properties: nested objects allowedrequired: required field namesadditionalProperties: false: recommended; omit to allow extra keysenum: allowed onstring,integer,numberitems: required onarraytypes; supports nested schemasanyOf/oneOf: supported on object properties$ref: supported within the samejson_schemaobjectdescription: optional; the model uses it as guidance
pattern(regex matching) — falls back to best-effort even in strict modeformat(date-time, email, etc.) — same; best-effortminimum/maximum/minLength/maxLength— same; best-effort- External
$refto remote schema URLs
400 response_format_too_deep or 400 response_format_too_large error.
Schema size + depth caps
- Maximum schema depth: 8 levels of nesting (root object = depth 1)
- Maximum total schema size: 128KB of JSON-encoded text
- Maximum properties per object: 128
- Maximum enum values per field: 256
Streaming + structured output
response_format works with stream: true. The model emits valid-JSON-shaped partial frames; the FULL JSON validates against the schema only after the final non-[DONE] chunk. If you want to parse incrementally as bytes arrive, use a streaming JSON parser (e.g., clarinet / oboe.js in Node, ijson in Python) — the partial frame text is sound enough for prefix parsing.
Tool calls vs structured output
Bothtools and response_format can be used in the same request. The semantics:
- If
tool_choiceresolves to a tool call, the model emits the tool call (not the structured response). - If the model produces an assistant message instead of a tool call, that message is constrained to the
response_format.
tools when the model should choose between calling a function or producing a structured answer, and response_format alone when the model must always produce JSON.
Error modes
| Condition | Status | Code |
|---|---|---|
| Schema exceeds depth cap | 400 | response_format_too_deep |
| Schema exceeds size cap | 400 | response_format_too_large |
Malformed json_schema block | 400 | DTO invalid_request |
| Strict mode requested but model emitted invalid JSON | 502 | chat_provider_unknown_error |
aurous-grow-2.0-pro + strict: true — the decoder is constrained at token-emission time, not validated post-hoc.
Where to next?
- Chat overview — the full chat surface
- Chat tools — function/tool calling
POST /v1/chat/completions— endpoint reference

