Introduction

Vocenza gives developers a single, low-latency API for building voice into any product: expressive text-to-speech, accurate speech-to-text, a full-duplex Realtime API for natural conversations, and an LLM Router that picks the best model for every turn.

New here?

Jump straight to the Quickstart to make your first request in under a minute, then explore the model guides below.

What you can build

Voice agents — full-duplex conversations with barge-in and sub-300ms response times via the Realtime API.
Narration & media — stream studio-quality speech from text with the Text to Speech endpoint.
Transcription — turn audio and live streams into accurate, timestamped text with Speech to Text.
Smart routing — send a prompt and let the LLM Router choose the fastest, cheapest model that meets your quality bar.

Conventions

Base URL: https://api.vocenza.com/v1
JSON in, JSON out (UTF-8). Audio is sent and returned as binary or base64.
Authenticate with the header Authorization: Bearer voc_…; never pass the key in a query string.
Conventional HTTP status codes. Error bodies always carry error.type and error.message (see Errors).
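
The conventions above can be sketched as a few small Python helpers. This is a minimal illustration, not an official client: the function names, the /example path, and the sample error values are hypothetical, and the check that keys begin with voc_ is inferred from the header format shown above. Only the base URL, the Bearer header, UTF-8 JSON, base64 audio, and the error.type / error.message fields come from this page.

```python
import base64
import json

# Base URL as listed under Conventions.
BASE_URL = "https://api.vocenza.com/v1"


def build_url(path: str) -> str:
    """Join a relative path onto the base URL (the path itself is up to the caller)."""
    return f"{BASE_URL}/{path.lstrip('/')}"


def build_headers(api_key: str) -> dict:
    """Build request headers: the key goes in Authorization, never in the URL."""
    # Assumption: keys start with "voc_", as the header example suggests.
    if not api_key.startswith("voc_"):
        raise ValueError("expected an API key beginning with 'voc_'")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json; charset=utf-8",
    }


def encode_audio(raw: bytes) -> str:
    """Encode raw audio bytes as base64 text so they can travel inside JSON."""
    return base64.b64encode(raw).decode("ascii")


def parse_error(body: bytes) -> tuple[str, str]:
    """Extract (error.type, error.message) from a UTF-8 JSON error body."""
    payload = json.loads(body.decode("utf-8"))
    err = payload["error"]
    return err["type"], err["message"]


if __name__ == "__main__":
    # "/example" is a placeholder path, not a documented endpoint.
    print(build_url("/example"))
    print(build_headers("voc_demo")["Authorization"])
    # Sample error values are made up for illustration.
    sample = b'{"error": {"type": "sample_type", "message": "sample message"}}'
    print(parse_error(sample))
```

Keeping the key in a header rather than the URL keeps it out of the access logs and browser history that routinely record query strings.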

Start here

Create an API key

Make your first request

Follow the Quickstart to synthesize your first sentence of speech.

Go realtime

Open a WebSocket connection to the Realtime API for live, two-way conversation.
