Rate Limits

How limits are measured, the headers we return, and how to handle 429s.

Rate limits protect platform stability and are scoped per project. Limits differ by endpoint because audio workloads are measured differently from text.

How limits are measured

Endpoint	Limited by
Text to Speech	Characters per minute & requests per minute
Speech to Text	Audio seconds per minute & requests per minute
Realtime API	Concurrent sessions & audio minutes
LLM Router	Tokens per minute & requests per minute

Response headers

Every response includes your current standing so you can pace clients proactively rather than waiting for a 429.

RateLimit-Limit: 600
RateLimit-Remaining: 581
RateLimit-Reset: 23

Header	Meaning
`RateLimit-Limit`	Maximum requests in the current window.
`RateLimit-Remaining`	Requests left in the current window.
`RateLimit-Reset`	Seconds until the window resets.
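
As a minimal sketch of proactive pacing, the function below spreads the remaining requests evenly over the time left in the window. The name `pacing_delay` is hypothetical and not part of any SDK. It assumes the three headers carry plain numeric values as in the sample above, and that the caller passes a mapping whose keys match the header names exactly.

```python
from typing import Mapping


def pacing_delay(headers: Mapping[str, str]) -> float:
    """Return how many seconds to wait before sending the next request.

    Hypothetical helper: divides the time left in the window by the
    number of requests left, so the budget is used up evenly.
    """
    try:
        remaining = int(headers["RateLimit-Remaining"])
        reset = float(headers["RateLimit-Reset"])
    except (KeyError, ValueError):
        # Headers missing or not numeric: do not pace at all.
        return 0.0
    if reset <= 0:
        # The window is resetting now; no need to wait.
        return 0.0
    if remaining <= 0:
        # Budget exhausted: wait until the window resets.
        return reset
    return reset / remaining
```

With the sample headers above (581 requests left, 23 seconds until reset), this returns roughly 0.04 seconds between requests.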

Handling 429s

When you exceed a limit, you'll receive a 429 with a Retry-After header. Back off for the indicated time, then retry; see the retry helper.
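
The back-off-and-retry pattern could look roughly like the sketch below. It is not the platform's retry helper: `retry_on_429`, its parameters, and the `send` callable (which is assumed to perform one request and return a status code with its headers) are all hypothetical. It also assumes Retry-After is given in seconds, and falls back to capped exponential backoff with jitter when the header is missing or not numeric.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, headers)


def retry_on_429(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-429 status or attempts run out.

    Hypothetical helper; returns the last response either way.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers = response
        if status != 429:
            break
        try:
            # Assumption: Retry-After is a number of seconds.
            delay = max(0.0, float(headers.get("Retry-After")))
        except (TypeError, ValueError):
            # Header missing or not numeric: exponential backoff,
            # capped at max_delay, with full jitter.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
        sleep(delay)
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting; in normal use the default `time.sleep` applies.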

Raising limits

Need more throughput? Higher limits are available on paid plans. See Pricing or reach out from the Playground.
