POST /v1/chat/completions accepts a response_format parameter that forces the model’s output to be valid JSON. Two modes are supported:
  • { "type": "json_object" }: loose. Guarantees only that the output parses as JSON. No schema enforcement.
  • { "type": "json_schema", "json_schema": {...} }: strict. The output must conform to the supplied JSON Schema, which is enforced while the model decodes rather than checked afterwards.
Both modes follow OpenAI’s shape, so the OpenAI SDK works without modification.

json_object — loose JSON output

The simplest mode. Useful when the schema is implicit (the prompt tells the model what to produce) and you just need parseable JSON back.
// `client` is an OpenAI SDK instance configured for this platform.
const completion = await client.chat.completions.create({
  model: "aurous-grow-2.0-pro",
  messages: [
    // json_object mode enforces no schema, so the prompt spells out the expected shape.
    { role: "system", content: 'Respond with JSON only. Use the structure { "country": string, "capital": string }.' },
    { role: "user", content: "France" },
  ],
  response_format: { type: "json_object" },
  max_tokens: 100,
});

const parsed = JSON.parse(completion.choices[0].message.content!);
// { country: "France", capital: "Paris" }
The model is constrained to emit syntactically-valid JSON, but the schema is up to you to communicate in the prompt. Use json_object mode when the schema is loose, dynamic, or you want the model to choose its own keys.
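Because json_object mode guarantees only parseable JSON, it can help to check the keys you asked for before using them. The following is a minimal sketch of such a check; parseWithStringKeys is a hypothetical helper, not part of the platform or the OpenAI SDK, and it assumes the keys you care about should all hold strings.

```typescript
// Hypothetical helper: parse a json_object-mode reply and confirm that the
// keys requested in the prompt are present and hold strings.
function parseWithStringKeys(
  raw: string,
  requiredKeys: string[],
): Record<string, unknown> {
  const value: unknown = JSON.parse(raw); // throws SyntaxError on invalid JSON
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    throw new Error("expected a JSON object at the top level");
  }
  const record = value as Record<string, unknown>;
  for (const key of requiredKeys) {
    if (typeof record[key] !== "string") {
      throw new Error(`missing or non-string key: ${key}`);
    }
  }
  return record;
}

// Usage with a literal string standing in for message.content.
const reply = parseWithStringKeys('{"capital": "Paris"}', ["capital"]);
console.log(reply["capital"]);
```

Throwing on a missing key keeps the failure close to the model call instead of surfacing later as an undefined value in downstream code.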

json_schema — strict schema-enforced output

The model is constrained at decoding time to emit JSON that validates against your supplied schema. Useful when downstream code expects an exact shape — extracting fields from a document, classification, structured agent calls.
const completion = await client.chat.completions.create({
  model: "aurous-grow-2.0-pro",
  messages: [
    { role: "user", content: "Extract: 'Jane Doe lives in Berlin and is 34 years old.'" },
  ],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "person",
      strict: true,
      schema: {
        type: "object",
        properties: {
          name: { type: "string" },
          city: { type: "string" },
          age: { type: "integer" },
        },
        required: ["name", "city", "age"],
        additionalProperties: false,
      },
    },
  },
  max_tokens: 200,
});

const person = JSON.parse(completion.choices[0].message.content!);
// { name: "Jane Doe", city: "Berlin", age: 34 }
The name field is required: it identifies the schema in your code, and although the platform does not use it, the OpenAI-compatible surface demands it. strict: true enables the model’s strict-mode decoder. With strict: false, or with strict omitted, the model falls back to best-effort adherence: it tries to follow the schema, but conformance is not guaranteed.

Schema constraints

The supplied schema must obey a subset of JSON Schema:
  • type: object, array, string, integer, number, boolean, null (any combination)
  • properties: nested objects allowed
  • required: required field names
  • additionalProperties: false: recommended; omit to allow extra keys
  • enum: allowed on string, integer, number
  • items: required on array types; supports nested schemas
  • anyOf / oneOf: supported on object properties
  • $ref: supported within the same json_schema object
  • description: optional; the model uses it as guidance
Constraints not enforced by the strict decoder:
  • pattern (regex matching): best-effort even in strict mode
  • format (date-time, email, etc.): best-effort even in strict mode
  • minimum / maximum / minLength / maxLength: best-effort even in strict mode
  • External $ref to remote schema URLs: not supported
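Since pattern, format, and the min/max keywords are only best-effort, it can be useful to know where a schema relies on them so those fields can be re-checked after parsing. The sketch below walks a schema and reports each occurrence. findBestEffortKeywords and its path notation are hypothetical, and the walk covers only properties, items, anyOf, and oneOf (it does not follow $ref).

```typescript
// Keywords the list above describes as best-effort even in strict mode.
const BEST_EFFORT_KEYWORDS = [
  "pattern", "format", "minimum", "maximum", "minLength", "maxLength",
];

type Schema = { [key: string]: unknown };

function isSchema(v: unknown): v is Schema {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Hypothetical helper: return a path for every best-effort keyword found.
function findBestEffortKeywords(schema: unknown, path = "#"): string[] {
  if (!isSchema(schema)) return [];
  const found: string[] = [];
  for (const keyword of BEST_EFFORT_KEYWORDS) {
    if (keyword in schema) found.push(`${path}/${keyword}`);
  }
  // Property names are user data, not keywords, so recurse into values only.
  const props = schema["properties"];
  if (isSchema(props)) {
    for (const [name, sub] of Object.entries(props)) {
      found.push(...findBestEffortKeywords(sub, `${path}/properties/${name}`));
    }
  }
  found.push(...findBestEffortKeywords(schema["items"], `${path}/items`));
  for (const combiner of ["anyOf", "oneOf"]) {
    const branches = schema[combiner];
    if (Array.isArray(branches)) {
      for (let i = 0; i < branches.length; i++) {
        found.push(...findBestEffortKeywords(branches[i], `${path}/${combiner}/${i}`));
      }
    }
  }
  return found;
}

const exampleSchema = {
  type: "object",
  properties: {
    email: { type: "string", format: "email" },
    age: { type: "integer", minimum: 0 },
    format: { type: "string" }, // a property that merely happens to be named "format"
  },
};
console.log(findBestEffortKeywords(exampleSchema));
// [ '#/properties/email/format', '#/properties/age/minimum' ]
```

Any path this returns marks a field whose constraint should be validated in your own code after JSON.parse.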
If your schema exceeds the platform’s depth or size guard, you’ll get a 400 response_format_too_deep or 400 response_format_too_large error.

Schema size + depth caps

  • Maximum schema depth: 8 levels of nesting (root object = depth 1)
  • Maximum total schema size: 128KB of JSON-encoded text
  • Maximum properties per object: 128
  • Maximum enum values per field: 256
These caps exist to prevent the validator from doing pathologically slow work on adversarial input. Real-world schemas don’t come close.
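A client-side pre-flight check against these caps might look like the sketch below. It is illustrative only: checkSchemaCaps is a hypothetical helper, "128KB" is assumed to mean 128 * 1024 bytes of UTF-8, and depth is assumed to increase by one for each subschema reached through properties, items, anyOf, or oneOf. The platform's own accounting may differ.

```typescript
// Caps quoted from the list above.
const MAX_DEPTH = 8;
const MAX_BYTES = 128 * 1024; // assumption: 128KB = 128 * 1024 UTF-8 bytes
const MAX_PROPERTIES = 128;
const MAX_ENUM_VALUES = 256;

type SchemaNode = { [key: string]: unknown };

function isNode(v: unknown): v is SchemaNode {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Collect the direct subschemas of a node.
function childSchemas(node: SchemaNode): SchemaNode[] {
  const out: SchemaNode[] = [];
  const props = node["properties"];
  if (isNode(props)) {
    for (const sub of Object.values(props)) if (isNode(sub)) out.push(sub);
  }
  const items = node["items"];
  if (isNode(items)) out.push(items);
  for (const key of ["anyOf", "oneOf"]) {
    const branches = node[key];
    if (Array.isArray(branches)) {
      for (const branch of branches) if (isNode(branch)) out.push(branch);
    }
  }
  return out;
}

// Hypothetical pre-flight check; returns a list of human-readable problems.
function checkSchemaCaps(schema: SchemaNode): string[] {
  const problems: string[] = [];
  const bytes = new TextEncoder().encode(JSON.stringify(schema)).length;
  if (bytes > MAX_BYTES) problems.push(`size ${bytes} bytes exceeds ${MAX_BYTES}`);

  let deepest = 0;
  const visit = (node: SchemaNode, depth: number): void => {
    if (depth > deepest) deepest = depth;
    const props = node["properties"];
    if (isNode(props) && Object.keys(props).length > MAX_PROPERTIES) {
      problems.push(`object at depth ${depth} has more than ${MAX_PROPERTIES} properties`);
    }
    const values = node["enum"];
    if (Array.isArray(values) && values.length > MAX_ENUM_VALUES) {
      problems.push(`enum at depth ${depth} has more than ${MAX_ENUM_VALUES} values`);
    }
    for (const child of childSchemas(node)) visit(child, depth + 1);
  };
  visit(schema, 1); // root object = depth 1
  if (deepest > MAX_DEPTH) problems.push(`depth ${deepest} exceeds ${MAX_DEPTH}`);
  return problems;
}
```

An empty array means the schema passed every check under these assumptions; the server remains the authority on what it accepts.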

Streaming + structured output

response_format works with stream: true. Each chunk carries a fragment of the JSON text, so the accumulated content is always a prefix of a valid document, but the complete JSON can be validated against the schema only once the last content chunk (the one before [DONE]) has arrived. To parse incrementally as bytes arrive, use a streaming JSON parser (e.g., clarinet or oboe.js in Node, ijson in Python); the partial text is well-formed enough for prefix parsing.
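When incremental parsing is not needed, the simplest approach is to accumulate the fragments and parse once at the end. The sketch below shows that pattern with simulated fragments in place of a live stream. JsonStreamBuffer is a hypothetical helper, and in a real loop the fragment would come from the OpenAI-style chunk.choices[0].delta.content field.

```typescript
// Hypothetical accumulator for streamed structured output.
class JsonStreamBuffer {
  private text = "";

  // Append one delta fragment; null or undefined deltas are ignored.
  push(fragment: string | null | undefined): void {
    if (fragment) this.text += fragment;
  }

  // Call after the stream has ended; only then is the document complete.
  finish<T>(): T {
    return JSON.parse(this.text) as T;
  }
}

// Simulated fragments standing in for streamed chunks.
const fragments = ['{"name": "Jane', ' Doe", "city": "Ber', 'lin", "age": 34}'];
const buffer = new JsonStreamBuffer();
for (const fragment of fragments) buffer.push(fragment);

const streamedPerson = buffer.finish<{ name: string; city: string; age: number }>();
console.log(streamedPerson.city); // Berlin
```

Calling finish() before the stream ends throws a SyntaxError, because a prefix of a JSON document is not itself valid JSON.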

Tool calls vs structured output

Both tools and response_format can be used in the same request. The semantics:
  • If tool_choice resolves to a tool call, the model emits the tool call (not the structured response).
  • If the model produces an assistant message instead of a tool call, that message is constrained to the response_format.
In practice, use tools when the model should choose between calling a function or producing a structured answer, and response_format alone when the model must always produce JSON.
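When both are set, calling code has to handle either outcome. The sketch below branches on the assistant message using the OpenAI message shape (tool_calls and content). interpretMessage, the Outcome type, and the lookup_weather tool name are hypothetical, and only the first tool call is inspected.

```typescript
// Minimal shapes following the OpenAI chat-completions message format.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}
interface AssistantMessage {
  content: string | null;
  tool_calls?: ToolCall[];
}

type Outcome =
  | { kind: "tool_call"; name: string; args: unknown }
  | { kind: "structured"; data: unknown };

// Hypothetical helper: decide which of the two outcomes the model produced.
function interpretMessage(message: AssistantMessage): Outcome {
  const call = message.tool_calls?.[0];
  if (call) {
    // Tool-call arguments arrive as a JSON-encoded string.
    return {
      kind: "tool_call",
      name: call.function.name,
      args: JSON.parse(call.function.arguments),
    };
  }
  if (message.content === null) {
    throw new Error("message has neither tool calls nor content");
  }
  // No tool call: the content is the response_format-constrained JSON.
  return { kind: "structured", data: JSON.parse(message.content) };
}

const viaTool = interpretMessage({
  content: null,
  tool_calls: [
    {
      id: "call_1",
      type: "function",
      function: { name: "lookup_weather", arguments: '{"city": "Berlin"}' },
    },
  ],
});
const viaContent = interpretMessage({ content: '{"answer": "sunny"}' });
console.log(viaTool.kind, viaContent.kind); // tool_call structured
```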

Error modes

Condition | Status | Code
Schema exceeds depth cap | 400 | response_format_too_deep
Schema exceeds size cap | 400 | response_format_too_large
Malformed json_schema block | 400 | DTO invalid_request
Strict mode requested but model emitted invalid JSON | 502 | chat_provider_unknown_error
The last case is rare in practice with aurous-grow-2.0-pro + strict: true — the decoder is constrained at token-emission time, not validated post-hoc.
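One way to act on these errors is to separate the ones that need a schema or request change from the one that may succeed on a second attempt. The sketch below is an assumption-laden illustration: classifyError and its action names are hypothetical, the status and code are taken as already extracted from the error response (whose body shape is not specified here), and treating the 502 as retryable is a judgment call rather than documented platform behavior.

```typescript
type ErrorAction = "fix_schema" | "fix_request" | "retry" | "unknown";

// Hypothetical mapping from (status, code) to a suggested client action.
function classifyError(status: number, code: string): ErrorAction {
  if (status === 400) {
    // Depth and size violations require shrinking the schema itself.
    if (code === "response_format_too_deep" || code === "response_format_too_large") {
      return "fix_schema";
    }
    // Any other 400 points at a malformed request; retrying unchanged will not help.
    return "fix_request";
  }
  if (status === 502 && code === "chat_provider_unknown_error") {
    // Assumption: a rare decoding failure may not recur on a fresh attempt.
    return "retry";
  }
  return "unknown";
}

console.log(classifyError(400, "response_format_too_deep")); // fix_schema
console.log(classifyError(502, "chat_provider_unknown_error")); // retry
```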
