Code: max_tokens_exceeds_hard_cap
HTTP status: 400
Type: invalid_request

When it fires

The max_tokens value on your chat request exceeds the model’s aurous_metadata.max_output_tokens_hard_cap. The cap is a per-model ceiling on output length; it applies no matter how large the model’s context window is.

How to recover

Read the model’s max_output_tokens_hard_cap from GET /v1/models and request a value at or below it. If you need more output than any current model supports, contact support.
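
As a rough illustration of the recovery step, the sketch below looks up a model's cap in an already-parsed GET /v1/models response and lowers the requested max_tokens to fit. The response shape (a top-level "data" list whose entries carry an "id"), the helper names hard_cap_for and clamp_max_tokens, and the sample values are all assumptions made for this example; only the aurous_metadata.max_output_tokens_hard_cap field comes from the description above. Check the real response body before relying on it.

```python
from typing import Any, Dict


def hard_cap_for(models_response: Dict[str, Any], model_id: str) -> int:
    """Return the output-token hard cap for model_id.

    Assumes models_response is the parsed JSON body of GET /v1/models and
    has the shape {"data": [{"id": ..., "aurous_metadata": {...}}, ...]}.
    That shape is an assumption, not something this page confirms.
    """
    for model in models_response["data"]:
        if model["id"] == model_id:
            return int(model["aurous_metadata"]["max_output_tokens_hard_cap"])
    raise KeyError(f"model not found: {model_id}")


def clamp_max_tokens(requested: int, hard_cap: int) -> int:
    """Lower requested to hard_cap if needed, so it never exceeds the cap."""
    return min(requested, hard_cap)


# Hypothetical, hand-written response used only for illustration.
sample_models_response = {
    "data": [
        {
            "id": "example-model",
            "aurous_metadata": {"max_output_tokens_hard_cap": 8192},
        },
    ]
}

cap = hard_cap_for(sample_models_response, "example-model")
safe_max_tokens = clamp_max_tokens(20000, cap)  # value to send as max_tokens
```

Clamping silently shortens the output budget, so a caller that genuinely needs the larger value may prefer to surface the mismatch instead of sending the reduced request.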