Transcribe speech to text
POST /ai/audio/transcriptions
Transcribes speech to text. This endpoint is compatible with the OpenAI Transcription API and can be used with the OpenAI JS or Python SDK.
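As a rough sketch of the SDK route mentioned above, the OpenAI Python client can be pointed at this endpoint by overriding its base URL. The base URL here is an assumption derived from the endpoint path and the curl sample's host, and make_client and transcribe are hypothetical helper names, not part of any SDK.

```python
def make_client(api_key: str):
    """Create an OpenAI SDK client aimed at the Telnyx API (assumed base URL)."""
    # Imported lazily so the helper below can be used without the SDK installed.
    from openai import OpenAI  # requires `pip install openai`

    # Assumption: the SDK appends /audio/transcriptions to this base URL,
    # giving https://api.telnyx.com/v2/ai/audio/transcriptions.
    return OpenAI(api_key=api_key, base_url="https://api.telnyx.com/v2/ai")


def transcribe(client, audio_path: str, model: str = "distil-whisper/distil-large-v2"):
    """Upload a local audio file and return the transcription object."""
    with open(audio_path, "rb") as audio_file:
        return client.audio.transcriptions.create(model=model, file=audio_file)
```

With a real API key, transcribe(make_client("<TOKEN>"), "audio.mp3").text would be expected to hold the transcript, assuming the response follows the OpenAI shape.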
Request
- multipart/form-data
Body
required
file
The audio file object to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm. File uploads are limited to 100 MB. Cannot be used together with file_url.
file_url
A link to an audio file in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm. Hosted files are limited to 100 MB. Cannot be used together with file.
model
Possible values: [distil-whisper/distil-large-v2, openai/whisper-large-v3-turbo]
Default value: distil-whisper/distil-large-v2
ID of the model to use. distil-whisper/distil-large-v2 has lower latency but is English-only. openai/whisper-large-v3-turbo is multilingual but has slightly higher latency.
response_format
Possible values: [json, verbose_json]
Default value: json
The format of the transcript output. Use verbose_json to get timestamps.
timestamp_granularities[]
Possible values: [segment]
The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Currently, only segment is supported.
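The constraints described for the body parameters can be checked before a request is sent. The following is a minimal sketch, not an official client: build_form_fields is a hypothetical helper, the field names model and timestamp_granularities[] are assumed to mirror the OpenAI Transcription API, and requiring exactly one of file or file_url is an assumption (the descriptions above only state that the two cannot be combined).

```python
from typing import Optional

# Allowed values taken from the parameter descriptions above.
MODELS = {"distil-whisper/distil-large-v2", "openai/whisper-large-v3-turbo"}
RESPONSE_FORMATS = {"json", "verbose_json"}


def build_form_fields(
    file_path: Optional[str] = None,
    file_url: Optional[str] = None,
    model: str = "distil-whisper/distil-large-v2",
    response_format: str = "json",
    segment_timestamps: bool = False,
) -> dict:
    """Validate the inputs and return the non-file form fields for the request."""
    # file and file_url cannot be used together; requiring one is an assumption.
    if (file_path is None) == (file_url is None):
        raise ValueError("provide exactly one of file_path or file_url")
    if model not in MODELS:
        raise ValueError(f"unsupported model: {model}")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")
    # Timestamp granularities are only available with verbose_json.
    if segment_timestamps and response_format != "verbose_json":
        raise ValueError("timestamp granularities require response_format=verbose_json")

    fields = {"model": model, "response_format": response_format}
    if file_url is not None:
        fields["file_url"] = file_url
    if segment_timestamps:
        # Assumed field name, mirroring the OpenAI Transcription API.
        fields["timestamp_granularities[]"] = "segment"
    return fields
```

When file_path is used instead of file_url, the file itself would be attached separately as the multipart file part, so it does not appear in the returned dictionary.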
Responses
200: Successful Response
- application/json
422: Validation Error
- application/json
Request samples
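# Replace /path/to/audio.mp3 with the path to a local audio file.
# -F sends the body as multipart/form-data and sets the Content-Type header.
curl -L -X POST 'https://api.telnyx.com/v2/ai/audio/transcriptions' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer <TOKEN>' \
-F 'file=@/path/to/audio.mp3'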
Response samples
200: Successful Response
{
  "text": "string",
  "duration": 0,
  "segments": [
    {
      "id": 0,
      "start": 0,
      "end": 0,
      "text": "string"
    }
  ]
}
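To illustrate how the segments in the successful response sample above might be consumed, here is a small sketch that turns them into timestamped lines. format_segments is a hypothetical helper, the sample values are made up for illustration, and start and end are assumed to be offsets in seconds, as in the OpenAI API.

```python
import json


def format_segments(payload: dict) -> list:
    """Return one '[start - end] text' line per segment in a response body."""
    lines = []
    # segments may be absent when response_format is json, so default to empty.
    for segment in payload.get("segments", []):
        start = segment["start"]  # assumed to be seconds
        end = segment["end"]      # assumed to be seconds
        lines.append(f"[{start:.2f}s - {end:.2f}s] {segment['text'].strip()}")
    return lines


# Made-up response body with the same shape as the sample above.
sample = json.loads(
    '{"text": "Hello there.", "duration": 1.5,'
    ' "segments": [{"id": 0, "start": 0.0, "end": 1.5, "text": " Hello there."}]}'
)

for line in format_segments(sample):
    print(line)  # prints "[0.00s - 1.50s] Hello there."
```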
422: Validation Error
{
  "detail": [
    {
      "loc": [
        "string",
        0
      ],
      "msg": "string",
      "type": "string"
    }
  ]
}
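The validation error body shown above can be flattened into readable messages. This is a sketch only: summarize_validation_errors is a hypothetical helper, and the sample error values are invented to match the documented shape, not taken from a real response.

```python
def summarize_validation_errors(payload: dict) -> list:
    """Return one 'location: message (type)' string per entry in detail."""
    messages = []
    for error in payload.get("detail", []):
        # loc mixes strings and integers, so convert each part before joining.
        location = ".".join(str(part) for part in error.get("loc", []))
        messages.append(f"{location}: {error.get('msg', '')} ({error.get('type', '')})")
    return messages


# Invented example with the same shape as the 422 sample above.
sample_error = {
    "detail": [
        {"loc": ["body", "file"], "msg": "field required", "type": "value_error.missing"}
    ]
}

for message in summarize_validation_errors(sample_error):
    print(message)  # prints "body.file: field required (value_error.missing)"
```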