POST /v1/speech

Synthesize speech from text and return an audio stream.

POST /v1/speech

Request body

Field	Type	Required	Description
`model`	string	yes	`vocenza-tts-1` or `vocenza-tts-1-flash`.
`voice`	string	yes	Voice id from `GET /voices`.
`input`	string	yes	Text to synthesize (at most 5,000 characters).
`format`	string	no	One of `mp3`, `wav`, `opus`, or `pcm16`. Default: `mp3`.
`speed`	number	no	`0.5`–`2.0`. Default `1.0`.
`sample_rate`	number	no	Output sample rate in Hz.
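
The limits in the table can be checked on the client before a request is sent. The following is a minimal sketch, assuming a bash-compatible shell with `awk` available; `check_speech_args` is a hypothetical helper name and not part of the API, and the service remains the authority on what it accepts.

```shell
# Hypothetical helper: checks arguments against the documented limits.
# Usage: check_speech_args INPUT [FORMAT] [SPEED]
check_speech_args() {
  local input="$1" format="${2:-mp3}" speed="${3:-1.0}"

  # `input` is required and limited to 5,000 characters.
  # Note: ${#input} counts characters in bash under a UTF-8 locale; how the
  # service itself counts characters is an assumption here.
  if [ -z "$input" ]; then
    echo "input is required" >&2; return 1
  fi
  if [ "${#input}" -gt 5000 ]; then
    echo "input exceeds 5,000 characters" >&2; return 1
  fi

  # `format` must be one of the documented values.
  case "$format" in
    mp3|wav|opus|pcm16) ;;
    *) echo "unsupported format: $format" >&2; return 1 ;;
  esac

  # `speed` must be a plain decimal number within 0.5 to 2.0 inclusive.
  case "$speed" in
    ''|*[!0-9.]*|*.*.*) echo "speed must be a number" >&2; return 1 ;;
  esac
  if ! awk -v s="$speed" 'BEGIN { exit !(s >= 0.5 && s <= 2.0) }'; then
    echo "speed out of range: $speed" >&2; return 1
  fi
}
```

Running the check first, for example `check_speech_args "Hello." wav 1.25 && curl ...`, avoids a round trip for requests that the documented limits already rule out.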

Example

curl https://api.vocenza.com/v1/speech \
  -H "Authorization: Bearer $VOCENZA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "vocenza-tts-1", "voice": "aria", "input": "Hello." }' \
  --output hello.mp3

Response

On success, returns the raw audio stream with a `Content-Type` matching the requested `format` (e.g. `audio/mpeg` for `mp3`). On error, returns a JSON error object instead of audio.

Header	Description
`Content-Type`	Audio MIME type for the requested `format`.
`X-Request-Id`	Unique id for support and debugging.
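
Because the body is audio on success and JSON on error, a client usually needs the status code and the headers before deciding what to do with the body. The sketch below shows one way to do that with curl's `-D` (dump headers), `-o` (write body), and `-w '%{http_code}'` options. The helper names `header_value` and `synthesize_to_file` are hypothetical, and treating any 2xx status as success is an assumption, since the status codes are not listed here.

```shell
# Hypothetical helper: print the value of header NAME from a `curl -D` dump.
# Usage: header_value HEADER_FILE NAME
header_value() {
  # Header names are case-insensitive; strip the CR that ends each line.
  tr -d '\r' < "$1" | awk -v name="$2" '{
    i = index($0, ":")
    if (i && tolower(substr($0, 1, i - 1)) == tolower(name)) {
      v = substr($0, i + 1); sub(/^[ \t]+/, "", v); print v; exit
    }
  }'
}

# Hypothetical wrapper: POST a JSON body, keep the audio on success,
# print the JSON error object on failure.
# Usage: synthesize_to_file JSON_BODY OUTPUT_FILE
synthesize_to_file() {
  local body="$1" out="$2" headers status
  headers=$(mktemp) || return 1
  status=$(curl -sS https://api.vocenza.com/v1/speech \
    -H "Authorization: Bearer $VOCENZA_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$body" -D "$headers" -o "$out" -w '%{http_code}') \
    || { rm -f "$headers"; return 1; }

  # Log the request id either way so it can be quoted to support.
  echo "request id: $(header_value "$headers" X-Request-Id)" >&2
  case "$status" in
    2??)  # Assumed success range.
      echo "saved $(header_value "$headers" Content-Type) to $out" >&2
      rm -f "$headers" ;;
    *)    # The body is a JSON error object, not audio: show it, then discard.
      echo "HTTP $status:" >&2; cat "$out" >&2
      rm -f "$out" "$headers"; return 1 ;;
  esac
}
```

Checking the status before trusting the output file prevents a JSON error body from being saved under an audio file name such as `hello.mp3`.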
