POST /speech

Synthesizes speech from input text and returns the result as an audio stream.

POST /v1/speech

Request body

Field        Type    Required  Description
model        string  yes       vocenza-tts-1 or vocenza-tts-1-flash.
voice        string  yes       Voice id from GET /voices.
input        string  yes       Text to synthesize (≤ 5,000 chars).
format       string  no        mp3, wav, opus, pcm16. Default mp3.
speed        number  no        0.5 to 2.0. Default 1.0.
sample_rate  number  no        Output sample rate in Hz.
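
The field constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the function name `build_speech_body` and its parameter names are hypothetical, and it assumes that the speed bounds are inclusive and that the 5,000-character limit counts Unicode code points.

```python
ALLOWED_MODELS = {"vocenza-tts-1", "vocenza-tts-1-flash"}
ALLOWED_FORMATS = {"mp3", "wav", "opus", "pcm16"}
MAX_INPUT_CHARS = 5000


def build_speech_body(model, voice, text, audio_format=None, speed=None, sample_rate=None):
    """Validate the documented fields and return a JSON-serializable request body.

    Hypothetical helper: optional fields are omitted when not given, so the
    server-side defaults (mp3, speed 1.0) apply.
    """
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if not voice:
        raise ValueError("voice is required")
    # Assumption: the limit counts Unicode code points, which is what len() measures.
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError("input exceeds 5,000 characters")

    body = {"model": model, "voice": voice, "input": text}

    if audio_format is not None:
        if audio_format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported format: {audio_format!r}")
        body["format"] = audio_format
    if speed is not None:
        # Assumption: both ends of the 0.5 to 2.0 range are allowed.
        if not 0.5 <= speed <= 2.0:
            raise ValueError("speed must be between 0.5 and 2.0")
        body["speed"] = speed
    if sample_rate is not None:
        body["sample_rate"] = sample_rate
    return body
```

Serializing the returned dictionary with `json.dumps` yields a body that could be sent with `Content-Type: application/json`.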

Example

curl https://api.vocenza.com/v1/speech \
  -H "Authorization: Bearer $VOCENZA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "vocenza-tts-1", "voice": "aria", "input": "Hello." }' \
  --output hello.mp3

Response

Returns the raw audio stream with a Content-Type that matches the requested format (e.g. audio/mpeg for mp3). On error, returns a JSON error object instead of audio.

Header        Description
Content-Type  Audio MIME type for the requested format.
X-Request-Id  Unique id for support and debugging.
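
Because a response carries either audio or a JSON error object, a client has to decide which one it received. The sketch below shows one way to do that from the headers and body returned by any HTTP client. The names `SpeechError` and `parse_speech_response` are hypothetical. The sketch assumes that every successful response has an `audio/*` Content-Type and that anything else is the error object, whose fields are not specified here and are therefore passed through unchanged.

```python
import json


class SpeechError(Exception):
    """Hypothetical exception carrying the JSON error object and the request id."""

    def __init__(self, error, request_id):
        super().__init__(f"speech request failed (request id: {request_id}): {error}")
        self.error = error
        self.request_id = request_id


def parse_speech_response(headers, body):
    """Return (audio_bytes, mime_type, request_id), or raise SpeechError.

    `headers` is a mapping of header names to values; `body` is the raw bytes.
    """
    # HTTP header names are case-insensitive, so normalize them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    request_id = lowered.get("x-request-id")
    # Drop any parameters such as "; charset=utf-8" from the media type.
    mime_type = lowered.get("content-type", "").split(";")[0].strip().lower()

    if mime_type.startswith("audio/"):
        return body, mime_type, request_id

    # Assumption: a non-audio response is the JSON error object.
    try:
        error = json.loads(body.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON; keep a short raw excerpt.
        error = {"raw": body[:200].decode("utf-8", errors="replace")}
    raise SpeechError(error, request_id)
```

Keeping the `X-Request-Id` value on the exception makes it available for logging, so it can be quoted when contacting support.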