Rate Limits

How limits are measured, the headers we return, and how to handle 429s.

Rate limits protect platform stability and are scoped per project. Limits differ by endpoint because audio workloads are measured differently from text.

How limits are measured

Endpoint          Limited by
Text to Speech    Characters per minute & requests per minute
Speech to Text    Audio seconds per minute & requests per minute
Realtime API      Concurrent sessions & audio minutes
LLM Router        Tokens per minute & requests per minute
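
Because most endpoints are limited on two dimensions at once, a request only goes through if it fits within both budgets. The sketch below shows one way a client might track this locally for Text to Speech. It is illustrative only: `MinuteBudget` and `try_send` are hypothetical names, the limit values are placeholders rather than real plan limits, and the fixed 60-second window is an assumption about how the server counts.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MinuteBudget:
    """Client-side counter for one unit (requests, characters, ...).

    Assumes a fixed 60-second window; the server may count differently.
    """
    limit: float
    used: float = 0.0
    window_start: float = field(default_factory=time.monotonic)

    def can_spend(self, amount: float, now: float) -> bool:
        # Start a fresh window once 60 seconds have elapsed.
        if now - self.window_start >= 60.0:
            self.window_start = now
            self.used = 0.0
        return self.used + amount <= self.limit

    def spend(self, amount: float) -> None:
        self.used += amount


def try_send(text: str, requests: MinuteBudget, characters: MinuteBudget,
             now: Optional[float] = None) -> bool:
    """Reserve budget for one Text to Speech call; False if either limit would be hit."""
    now = time.monotonic() if now is None else now
    cost = len(text)
    # Check both dimensions before spending from either one.
    if requests.can_spend(1, now) and characters.can_spend(cost, now):
        requests.spend(1)
        characters.spend(cost)
        return True
    return False
```

A short request can still be refused when the request budget is spent, and a long one can be refused while requests remain, which is why both counters are checked before either is charged.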

Response headers

Every response includes your current standing so you can pace clients proactively rather than waiting for a 429.

RateLimit-Limit: 600
RateLimit-Remaining: 581
RateLimit-Reset: 23

Header               Meaning
RateLimit-Limit      Maximum requests in the current window.
RateLimit-Remaining  Requests left in the current window.
RateLimit-Reset      Seconds until the window resets.
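
As a rough illustration of proactive pacing, the sketch below reads these three headers and spreads the remaining requests evenly over the time left in the window. The function name `pacing_delay` is hypothetical, and even spacing is only one possible strategy, not a recommendation from the platform.

```python
from typing import Mapping, Optional


def pacing_delay(headers: Mapping[str, str]) -> Optional[float]:
    """Seconds to wait between requests so the remaining budget lasts until reset.

    Returns None when the headers are missing or cannot be parsed.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        remaining = int(lowered["ratelimit-remaining"])
        reset = max(float(lowered["ratelimit-reset"]), 0.0)
    except (KeyError, ValueError):
        return None
    if remaining <= 0:
        # Nothing left in this window: wait for the reset.
        return reset
    return reset / remaining
```

With the example headers above (581 requests remaining, 23 seconds to reset), this yields a gap of about 0.04 seconds between requests.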

Handling 429s

When you exceed a limit, you'll receive a 429 response with a Retry-After header. Back off for at least that long, then retry; see the retry helper.
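
For readers who want to see the shape of such a loop, here is a minimal sketch. It is not the retry helper referenced above. The names `retry_delay` and `send_with_retry` are hypothetical, the `send` callable stands in for whatever HTTP client you use, and the code assumes Retry-After carries a number of seconds (HTTP also allows a date, which this sketch does not parse).

```python
import random
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, body) as returned by your own HTTP wrapper.
Response = Tuple[int, Mapping[str, str], bytes]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Prefer Retry-After in seconds; otherwise exponential backoff with full jitter."""
    value = headers.get("Retry-After")
    if value is not None:
        try:
            return max(float(value), 0.0)
        except ValueError:
            pass  # Not a plain number (e.g. an HTTP-date); fall through to backoff.
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def send_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-429 response or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        status, headers, _body = response
        if status != 429:
            return response
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    # Still rate limited after the final attempt; hand the 429 back to the caller.
    return response
```

Capping the number of attempts keeps a persistently throttled client from retrying forever, and the jittered fallback avoids many clients retrying in lockstep when Retry-After is absent.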

Raising limits

Need more throughput? Higher limits are available on paid plans. See Pricing or reach out from the Playground.