POST /transcriptions

Transcribes an uploaded audio file to text, with optional timestamps and speaker diarization.

POST /v1/transcriptions
Content-Type: multipart/form-data

Form fields

Field	Type	Required	Description
`model`	string	yes	`vocenza-stt-1` or `vocenza-stt-1-flash`.
`file`	binary	yes	`wav`, `mp3`, `m4a`, `flac`, or `ogg`.
`language`	string	no	ISO-639-1 hint. Omit to auto-detect.
`timestamps`	string	no	`none`, `segment`, or `word`. Default `segment`.
`diarize`	boolean	no	Label speakers when `true`.
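
The constraints in the table can be checked client-side before uploading. The Python sketch below is illustrative only: `build_form_fields` is a hypothetical helper, not part of any Vocenza SDK. Checking the audio format by file extension and sending `diarize` as the string `true` or `false` are both assumptions about how a client might handle these fields.

```python
from pathlib import Path

# Allowed values copied from the form-field table.
MODELS = {"vocenza-stt-1", "vocenza-stt-1-flash"}
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg"}
TIMESTAMP_MODES = {"none", "segment", "word"}


def build_form_fields(model, filename, language=None, timestamps=None, diarize=None):
    """Validate arguments and return the non-file form fields as strings.

    Hypothetical helper. Optional fields left as None are omitted so the
    server applies its own defaults (for example, timestamps=segment).
    """
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    # Assumption: the file extension reflects the actual audio format.
    if Path(filename).suffix.lower() not in AUDIO_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {filename!r}")

    fields = {"model": model}
    if language is not None:
        fields["language"] = language
    if timestamps is not None:
        if timestamps not in TIMESTAMP_MODES:
            raise ValueError(f"invalid timestamps value: {timestamps!r}")
        fields["timestamps"] = timestamps
    if diarize is not None:
        # Assumption: booleans travel as the literal strings "true"/"false".
        fields["diarize"] = "true" if diarize else "false"
    return fields
```

The returned dictionary holds only the text fields; the audio itself still has to be attached as the `file` part of the multipart body.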

Example

curl https://api.vocenza.com/v1/transcriptions \
  -H "Authorization: Bearer $VOCENZA_API_KEY" \
  -F model="vocenza-stt-1" \
  -F file="@call.mp3" \
  -F timestamps="word"

Response

{
  "text": "Thanks for calling Vocenza.",
  "language": "en",
  "duration": 1.92,
  "words": [
    { "word": "Thanks", "start": 0.08, "end": 0.41 }
  ]
}

Field	Type	Description
`text`	string	Full transcript.
`language`	string	Detected (or supplied) language.
`duration`	number	Audio length in seconds.
`words`	array	Present when `timestamps=word`.
`segments`	array	Present when `diarize=true` or `timestamps=segment`.
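
Because `words` and `segments` are conditional, client code should not assume either key exists. A minimal Python sketch using the sample response shown earlier; `word_timings` is a hypothetical helper name, and the layout of `segments` entries is not documented here, so the sketch leaves them alone.

```python
import json


def word_timings(response):
    """Return (word, start, end) tuples, or an empty list if `words` is absent."""
    return [(w["word"], w["start"], w["end"]) for w in response.get("words", [])]


# The sample response from this page, returned for timestamps=word.
sample = json.loads("""
{
  "text": "Thanks for calling Vocenza.",
  "language": "en",
  "duration": 1.92,
  "words": [
    { "word": "Thanks", "start": 0.08, "end": 0.41 }
  ]
}
""")

print(word_timings(sample))    # [('Thanks', 0.08, 0.41)]
print("segments" in sample)    # False
```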
