Text to Speech
Generate expressive, studio-quality speech from text — buffered or streamed.
The /speech endpoint turns text into natural speech. Pick a model, a voice,
and an output format; receive audio back as a binary stream.
Basic request
curl https://api.vocenza.com/v1/speech \
-H "Authorization: Bearer $VOCENZA_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "vocenza-tts-1",
"voice": "aria",
"input": "The quick brown fox jumps over the lazy dog.",
"format": "mp3"
}' --output speech.mp3
Parameters
| Parameter | Type | Description |
|---|---|---|
| model | string | TTS model id. vocenza-tts-1 (quality) or vocenza-tts-1-flash (lowest latency). |
| voice | string | Voice id, e.g. aria, atlas, nova. See GET /voices. |
| input | string | The text to synthesize. Up to 5,000 characters per request. |
| format | string | mp3, wav, opus, or pcm16. Defaults to mp3. |
| speed | number | Playback rate from 0.5 to 2.0. Defaults to 1.0. |
| sample_rate | number | Output sample rate in Hz (e.g. 24000). |
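The limits in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: the `SpeechRequest` type and `validateSpeechRequest` function are hypothetical names, and counting characters with `String.length` (UTF-16 code units) is an assumption about how the 5,000-character limit is measured.

```typescript
// Hypothetical client-side types; field names and limits mirror the parameter table.
type SpeechFormat = "mp3" | "wav" | "opus" | "pcm16";

interface SpeechRequest {
  model: string;
  voice: string;
  input: string;
  format?: SpeechFormat; // server default: mp3
  speed?: number;        // server default: 1.0
  sample_rate?: number;  // Hz, e.g. 24000
}

const FORMATS: readonly string[] = ["mp3", "wav", "opus", "pcm16"];
const MAX_INPUT_CHARS = 5000; // assumption: measured in UTF-16 code units

// Returns a list of problems; an empty list means the request passes these checks.
function validateSpeechRequest(req: SpeechRequest): string[] {
  const problems: string[] = [];
  if (req.input.length > MAX_INPUT_CHARS) {
    problems.push(`input is ${req.input.length} characters; the limit is ${MAX_INPUT_CHARS}`);
  }
  if (req.format !== undefined && !FORMATS.includes(req.format)) {
    problems.push(`unknown format: ${req.format}`);
  }
  // The negated range check also rejects NaN.
  if (req.speed !== undefined && !(req.speed >= 0.5 && req.speed <= 2.0)) {
    problems.push("speed must be between 0.5 and 2.0");
  }
  return problems;
}
```

Text longer than the limit would need to be split across several requests; the server remains the authority on what it accepts.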
Streaming
For the lowest time-to-first-audio, stream the response and play chunks as they arrive instead of waiting for the full file.
const res = await fetch("https://api.vocenza.com/v1/speech", {
method: "POST",
headers: {
Authorization: `Bearer ${process.env.VOCENZA_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "vocenza-tts-1-flash",
voice: "aria",
input: "Streaming keeps latency low for live experiences.",
format: "pcm16",
}),
});
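// Fail early on HTTP errors or a missing body instead of reading an error payload as audio.
if (!res.ok || !res.body) throw new Error(`Speech request failed with status ${res.status}`);
const reader = res.body.getReader();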
for (;;) {
const { done, value } = await reader.read();
if (done) break;
enqueueToAudioOutput(value); // your playback buffer
}
Pick the right model
Use vocenza-tts-1-flash for interactive, latency-sensitive playback and
vocenza-tts-1 when you can buffer and want maximum fidelity.
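One way to encode this rule in application code is a small helper. This is only a sketch: the two model ids come from this page, while `pickModel` and its `interactive` option are hypothetical names.

```typescript
type TtsModel = "vocenza-tts-1" | "vocenza-tts-1-flash";

// Hypothetical helper: interactive playback favors time-to-first-audio,
// buffered output favors fidelity.
function pickModel(opts: { interactive: boolean }): TtsModel {
  return opts.interactive ? "vocenza-tts-1-flash" : "vocenza-tts-1";
}
```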
Pronunciation control
Wrap text in SSML-style tags to fine-tune delivery:
<break time="400ms"/> Let me think about that.
<emphasis level="strong">Absolutely.</emphasis>
Output formats
| Format | Container | Best for |
|---|---|---|
| mp3 | MPEG | General playback, small files |
| wav | RIFF | Editing, archival |
| opus | Ogg | Streaming over the network |
| pcm16 | raw | Realtime playback buffers |
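When pcm16 is streamed, network chunks do not necessarily end on a sample boundary, so a playback buffer may need to carry a dangling byte over to the next chunk. The sketch below illustrates one way to do that. It assumes that pcm16 means signed 16-bit little-endian samples, which this page does not state (it only says "raw"), and `Pcm16Decoder` is a hypothetical name.

```typescript
// Assumption: "pcm16" is signed 16-bit little-endian PCM. Confirm the byte
// order and channel count before relying on this.
class Pcm16Decoder {
  // Dangling low byte left over when a chunk ends mid-sample.
  private leftover: number | null = null;

  // Converts one network chunk into float samples in the range [-1, 1).
  push(chunk: Uint8Array): Float32Array {
    let bytes = chunk;
    if (this.leftover !== null) {
      // Prepend the dangling byte so sample boundaries line up again.
      bytes = new Uint8Array(chunk.length + 1);
      bytes[0] = this.leftover;
      bytes.set(chunk, 1);
      this.leftover = null;
    }
    const sampleCount = bytes.length >> 1;
    if (bytes.length % 2 === 1) {
      // Odd byte count: keep the last byte for the next call.
      this.leftover = bytes[bytes.length - 1] ?? null;
    }
    const view = new DataView(bytes.buffer, bytes.byteOffset, sampleCount * 2);
    const out = new Float32Array(sampleCount);
    for (let i = 0; i < sampleCount; i++) {
      out[i] = view.getInt16(i * 2, true) / 32768; // true = little-endian
    }
    return out;
  }
}
```

In a streaming read loop, each chunk could be passed through `decoder.push(value)` before it reaches the playback buffer. The resulting floats suit audio APIs that expect samples in the -1 to 1 range.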