POST /transcriptions
Transcribes an uploaded audio file to text, with optional timestamps and speaker diarization.
POST /v1/transcriptions
Content-Type: multipart/form-data

Form fields
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | vocenza-stt-1 or vocenza-stt-1-flash. |
| file | binary | yes | Audio file: wav, mp3, m4a, flac, or ogg. |
| language | string | no | ISO 639-1 language hint. Omit to auto-detect. |
| timestamps | string | no | none, segment, or word. Defaults to segment. |
| diarize | boolean | no | Labels speakers when true. |
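
The constraints in the table can be checked on the client before uploading. The following is a minimal Python sketch, not part of any official SDK: `build_form_fields` is a hypothetical helper, the allowed values are copied from the table, and two details are assumptions: that the audio format can be judged from the file extension, and that the boolean is sent as the string `true` or `false`.

```python
from pathlib import Path

# Allowed values copied from the form-fields table.
MODELS = {"vocenza-stt-1", "vocenza-stt-1-flash"}
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg"}
TIMESTAMP_MODES = {"none", "segment", "word"}


def build_form_fields(model, filename, language=None, timestamps=None, diarize=None):
    """Hypothetical helper: validate inputs and return the non-file form fields."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    # Assumption: the extension is a good enough proxy for the audio format.
    if Path(filename).suffix.lower() not in AUDIO_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {filename!r}")

    fields = {"model": model}
    if language is not None:
        fields["language"] = language  # omitted entirely to request auto-detect
    if timestamps is not None:
        if timestamps not in TIMESTAMP_MODES:
            raise ValueError(f"timestamps must be one of {sorted(TIMESTAMP_MODES)}")
        fields["timestamps"] = timestamps
    if diarize is not None:
        # Assumption: booleans are encoded as lowercase strings in the form.
        fields["diarize"] = "true" if diarize else "false"
    return fields


fields = build_form_fields("vocenza-stt-1", "call.mp3", timestamps="word")
print(fields)
```

Optional fields that are left unset are not sent at all, so the server-side defaults described in the table still apply.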
Example
curl https://api.vocenza.com/v1/transcriptions \
  -H "Authorization: Bearer $VOCENZA_API_KEY" \
  -F model="vocenza-stt-1" \
  -F file="@call.mp3" \
  -F timestamps="word"

Response
{
  "text": "Thanks for calling Vocenza.",
  "language": "en",
  "duration": 1.92,
  "words": [
    { "word": "Thanks", "start": 0.08, "end": 0.41 }
  ]
}

| Field | Type | Description |
|---|---|---|
| text | string | Full transcript. |
| language | string | Detected (or supplied) language. |
| duration | number | Audio length in seconds. |
| words | array | Present when timestamps=word. |
| segments | array | Present when diarize=true or timestamps=segment. |
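
Because `words` is only present when `timestamps=word`, client code should treat it as optional. The following Python sketch reads word timings from a response shaped like the example above. The `word_timings` helper is hypothetical, and the sample payload is the example response from this page, not live API output. The layout of `segments` entries is not documented here, so the sketch does not touch them.

```python
import json

# The example response from this page, used as sample input.
SAMPLE = """
{
  "text": "Thanks for calling Vocenza.",
  "language": "en",
  "duration": 1.92,
  "words": [
    { "word": "Thanks", "start": 0.08, "end": 0.41 }
  ]
}
"""


def word_timings(response):
    """Return (word, start, end) tuples; an empty list if `words` is absent."""
    return [(w["word"], w["start"], w["end"]) for w in response.get("words", [])]


response = json.loads(SAMPLE)
for word, start, end in word_timings(response):
    print(f"{start:.2f}-{end:.2f}s  {word}")  # prints "0.08-0.41s  Thanks"
```

Using `response.get("words", [])` keeps the same code path working for requests made with `timestamps=none` or `timestamps=segment`, where the `words` array is not returned.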