POST /v1/transcriptions

Transcribe an audio file to text with optional timestamps and diarization.

POST /v1/transcriptions
Content-Type: multipart/form-data

Form fields

Field       Type     Required  Description
model       string   yes       vocenza-stt-1 or vocenza-stt-1-flash.
file        binary   yes       wav, mp3, m4a, flac, or ogg.
language    string   no        ISO-639-1 hint. Omit to auto-detect.
timestamps  string   no        none, segment, or word. Default segment.
diarize     boolean  no        Label speakers when true.
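
The constraints in the field table can be checked on the client before uploading. The following is a minimal sketch, not part of any official SDK: the helper name build_form_fields is hypothetical, and sending booleans as the strings "true" and "false" is an assumption about how the API parses form values.

```python
from typing import Dict, Optional

# Allowed values copied from the form-field table.
MODELS = {"vocenza-stt-1", "vocenza-stt-1-flash"}
TIMESTAMP_MODES = {"none", "segment", "word"}


def build_form_fields(
    model: str,
    language: Optional[str] = None,
    timestamps: Optional[str] = None,
    diarize: Optional[bool] = None,
) -> Dict[str, str]:
    """Return the non-file form fields as strings, omitting unset optional ones."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    fields = {"model": model}
    if language is not None:
        # ISO-639-1 codes are two ASCII letters; leave the field out to auto-detect.
        if len(language) != 2 or not (language.isascii() and language.isalpha()):
            raise ValueError(f"expected a two-letter ISO-639-1 code, got {language!r}")
        fields["language"] = language.lower()
    if timestamps is not None:
        if timestamps not in TIMESTAMP_MODES:
            raise ValueError(f"timestamps must be one of {sorted(TIMESTAMP_MODES)}")
        fields["timestamps"] = timestamps
    if diarize is not None:
        # Assumption: the API accepts "true" / "false" for boolean form fields.
        fields["diarize"] = "true" if diarize else "false"
    return fields
```

Leaving optional fields out of the returned dictionary, rather than filling in defaults locally, lets the server apply its own defaults (segment timestamps, auto-detected language).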

Example

curl https://api.vocenza.com/v1/transcriptions \
  -H "Authorization: Bearer $VOCENZA_API_KEY" \
  -F model="vocenza-stt-1" \
  -F file="@call.mp3" \
  -F timestamps="word"
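
The same request can be built in Python with only the standard library. This is a rough sketch of what the curl command sends, not an official client: the helper names encode_multipart, build_request, and transcribe are hypothetical, and the application/octet-stream content type on the file part is an assumption (curl may send a more specific audio type).

```python
import json
import urllib.request
import uuid
from typing import Any, Dict, Tuple

API_URL = "https://api.vocenza.com/v1/transcriptions"


def encode_multipart(
    fields: Dict[str, str], filename: str, audio: bytes
) -> Tuple[bytes, str]:
    """Encode text fields plus one "file" part; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    # Assumption: a generic binary content type is accepted for the audio part.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    parts.append(file_header + audio + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(api_key: str, filename: str, audio: bytes) -> urllib.request.Request:
    """Build the POST request with the same fields as the curl example."""
    body, content_type = encode_multipart(
        {"model": "vocenza-stt-1", "timestamps": "word"}, filename, audio
    )
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )


def transcribe(api_key: str, path: str) -> Dict[str, Any]:
    """Upload the file at `path` and return the decoded JSON response."""
    with open(path, "rb") as f:
        request = build_request(api_key, path, f.read())
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (performs a real network call):
#   result = transcribe(os.environ["VOCENZA_API_KEY"], "call.mp3")
#   print(result["text"])
```

Building the request separately from sending it keeps the encoding step easy to inspect without touching the network.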

Response

{
  "text": "Thanks for calling Vocenza.",
  "language": "en",
  "duration": 1.92,
  "words": [
    { "word": "Thanks", "start": 0.08, "end": 0.41 }
  ]
}

Field     Type    Description
text      string  Full transcript.
language  string  Detected (or supplied) language.
duration  number  Audio length in seconds.
words     array   Present when timestamps=word.
segments  array   Present when diarize=true or timestamps=segment.
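
Because words and segments appear only for certain request options, client code should treat both as optional. The sketch below reads the sample response shown above; the helper names word_timings and present_arrays are hypothetical, and the shape of a segment entry is not documented here, so the code only checks whether the array is present.

```python
import json
from typing import Any, Dict, List, Tuple

# The sample response body from this page.
SAMPLE = json.loads(
    """
    {
      "text": "Thanks for calling Vocenza.",
      "language": "en",
      "duration": 1.92,
      "words": [
        { "word": "Thanks", "start": 0.08, "end": 0.41 }
      ]
    }
    """
)


def word_timings(response: Dict[str, Any]) -> List[Tuple[str, float, float]]:
    """Return (word, start, end) tuples, or an empty list when "words" is absent."""
    return [(w["word"], w["start"], w["end"]) for w in response.get("words", [])]


def present_arrays(response: Dict[str, Any]) -> List[str]:
    """Report which of the optional arrays the response actually contains."""
    return [key for key in ("words", "segments") if key in response]


print(word_timings(SAMPLE))    # [('Thanks', 0.08, 0.41)]
print(present_arrays(SAMPLE))  # ['words']
```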