Rate Limits
How limits are measured, the headers we return, and how to handle 429s.
Rate limits protect platform stability and are scoped per project. Limits differ by endpoint because audio workloads are measured differently from text.
How limits are measured
| Endpoint | Limited by |
|---|---|
| Text to Speech | Characters per minute & requests per minute |
| Speech to Text | Audio seconds per minute & requests per minute |
| Realtime API | Concurrent sessions & audio minutes |
| LLM Router | Tokens per minute & requests per minute |
Response headers
Every response includes your current standing so you can pace clients
proactively rather than waiting for a 429.
RateLimit-Limit: 600
RateLimit-Remaining: 581
RateLimit-Reset: 23

| Header | Meaning |
|---|---|
| RateLimit-Limit | Maximum requests in the current window. |
| RateLimit-Remaining | Requests left in the current window. |
| RateLimit-Reset | Seconds until the window resets. |
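
One way to pace a client from these headers is to spread the remaining requests evenly over the time left in the window. The sketch below is illustrative only: `pacing_delay` is a hypothetical helper, not part of any SDK, and it assumes the three headers carry plain numeric values as in the example above.

```python
from typing import Mapping


def pacing_delay(headers: Mapping[str, str]) -> float:
    """Seconds to wait before the next request (hypothetical helper).

    Spreads the remaining request budget evenly across the time left
    in the current window. Returns 0.0 when the headers are missing or
    malformed, so callers fall back to sending without pacing.
    """
    # HTTP header names are case-insensitive; normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        remaining = int(lowered["ratelimit-remaining"])
        reset_seconds = float(lowered["ratelimit-reset"])
    except (KeyError, ValueError):
        return 0.0

    if reset_seconds <= 0:
        return 0.0  # The window is about to reset; no need to wait.
    if remaining <= 0:
        return reset_seconds  # Budget exhausted; wait out the window.
    return reset_seconds / remaining
```

With the example headers above (581 requests remaining, 23 seconds until reset), this yields a delay of roughly 0.04 seconds between requests. Even pacing is only one policy; a client that sends in bursts may prefer to pause only once the remaining count drops below a threshold.
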
Handling 429s
When you exceed a limit, the API responds with HTTP 429 and a Retry-After header.
Wait for the period indicated by Retry-After before retrying; see the retry helper.
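
The back-off-and-retry pattern can be sketched as follows. This is not the retry helper mentioned above, only a minimal illustration under stated assumptions: `send` is a hypothetical callable that performs one request and returns a `(status, headers, body)` tuple, and Retry-After is assumed to carry a whole number of seconds. HTTP also allows a date form, which this sketch does not parse; in that case, or when the header is missing, it falls back to exponential backoff with jitter.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next attempt (attempt counts from 0)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        # Prefer the server's hint when it is a whole number of seconds.
        return float(max(0, int(lowered["retry-after"])))
    except (KeyError, ValueError):
        # No usable hint: exponential backoff with full jitter, capped.
        return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_retries(send: Callable[[], Response],
                      max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        response = send()
        status, headers, _body = response
        if status != 429 or attempt == max_attempts - 1:
            return response  # Success, a different error, or final attempt.
        sleep(retry_delay(headers, attempt))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. The jitter in the fallback path helps avoid many clients retrying at the same instant after a shared limit is hit.
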
Raising limits
Need more throughput? Higher limits are available on paid plans. See Pricing or reach out from the Playground.