POST /speech
Synthesizes speech from the input text and returns it as an audio stream.
POST /v1/speech

Request body

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | vocenza-tts-1 or vocenza-tts-1-flash. |
| voice | string | yes | Voice id from GET /voices. |
| input | string | yes | Text to synthesize (≤ 5,000 chars). |
| format | string | no | mp3, wav, opus, pcm16. Default mp3. |
| speed | number | no | 0.5–2.0. Default 1.0. |
| sample_rate | number | no | Output sample rate in Hz. |
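
The constraints in the table can be checked client-side before a request is sent. The following is a minimal sketch, not part of any official SDK: the helper name `build_speech_payload` is hypothetical, and it assumes that "chars" means Unicode code points as counted by Python's `len`. The API remains the authority on validation.

```python
import json

# Allowed values copied from the request body table.
MODELS = {"vocenza-tts-1", "vocenza-tts-1-flash"}
FORMATS = {"mp3", "wav", "opus", "pcm16"}
MAX_INPUT_CHARS = 5000  # assumption: counted as Unicode code points


def build_speech_payload(model, voice, text, format=None, speed=None, sample_rate=None):
    """Validate the fields and return the JSON request body as a string.

    Hypothetical helper for illustration; optional fields are omitted
    when not given so that the server-side defaults apply.
    """
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if not voice:
        raise ValueError("voice is required")
    if not text:
        raise ValueError("input is required")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")

    body = {"model": model, "voice": voice, "input": text}

    if format is not None:
        if format not in FORMATS:
            raise ValueError(f"unsupported format: {format!r}")
        body["format"] = format
    if speed is not None:
        if not 0.5 <= speed <= 2.0:
            raise ValueError("speed must be between 0.5 and 2.0")
        body["speed"] = speed
    if sample_rate is not None:
        body["sample_rate"] = sample_rate

    return json.dumps(body)
```

For example, `build_speech_payload("vocenza-tts-1", "aria", "Hello.")` produces a body equivalent to the one used in the curl example on this page.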
Example
curl https://api.vocenza.com/v1/speech \
  -H "Authorization: Bearer $VOCENZA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "vocenza-tts-1", "voice": "aria", "input": "Hello." }' \
  --output hello.mp3

Response
Returns the raw audio stream with the matching Content-Type (e.g.
audio/mpeg). On error, returns a JSON error object.
| Header | Description |
|---|---|
| Content-Type | Audio MIME type for the requested format. |
| X-Request-Id | Unique id for support and debugging. |
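
Because the same endpoint returns either audio bytes or a JSON error object, a client needs to tell the two apart before writing a file. The sketch below uses only the Python standard library. It rests on assumptions this page does not confirm: that errors arrive with a non-2xx status or an `application/json` Content-Type, and that the error object has no particular shape. The names `SpeechError`, `interpret_response`, and `synthesize` are hypothetical.

```python
import json
import urllib.error
import urllib.request

API_URL = "https://api.vocenza.com/v1/speech"


class SpeechError(Exception):
    """Raised when the response is not audio. Keeps X-Request-Id for support."""

    def __init__(self, status, payload, request_id):
        super().__init__(f"speech request failed with HTTP {status} (request id: {request_id})")
        self.status = status
        self.payload = payload  # parsed JSON error object, or None if unparseable
        self.request_id = request_id


def interpret_response(status, headers, body):
    """Return (audio_bytes, content_type, request_id) or raise SpeechError."""
    request_id = headers.get("X-Request-Id")
    content_type = headers.get("Content-Type", "")
    # Assumption: an error is signalled by a non-2xx status or a JSON body.
    is_error = not 200 <= status < 300 or content_type.startswith("application/json")
    if not is_error:
        return body, content_type, request_id
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:  # also covers UnicodeDecodeError
        payload = None
    raise SpeechError(status, payload, request_id)


def synthesize(payload_json, api_key, out_path):
    """POST the JSON body and write the returned audio to out_path."""
    req = urllib.request.Request(
        API_URL,
        data=payload_json.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            status, headers, body = resp.status, resp.headers, resp.read()
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the error still carries headers and a body.
        status, headers, body = err.code, err.headers, err.read()
    audio, content_type, request_id = interpret_response(status, headers, body)
    with open(out_path, "wb") as f:
        f.write(audio)
    return content_type, request_id
```

A call such as `synthesize('{"model": "vocenza-tts-1", "voice": "aria", "input": "Hello."}', api_key, "hello.mp3")` would mirror the curl example. This sketch reads the whole body into memory for simplicity, so a client handling long inputs may prefer to read the stream in chunks.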