provider_rate_limited
HTTP status: 503
Type: rate_limit
When it fires
The upstream model provider returned a throttle response while serving your chat completion or embedding. Distinct fromtoo_many_requests (which is the platform’s per-team RPM cap) and tpm_rate_limit_exceeded (per-team TPM cap) — this code means the upstream model itself is hot.
How to recover
Retry with exponential backoff. TheRetry-After header is forwarded from upstream when available — respect it if set. If you’re hitting this repeatedly, consider spreading load or reaching out to support.
