POST /ai/audio/transcriptions
JavaScript
import Telnyx from 'telnyx';

const client = new Telnyx({
  apiKey: 'My API Key',
});

const response = await client.ai.audio.transcribe({ model: 'distil-whisper/distil-large-v2' });

console.log(response.text);
{
  "text": "<string>",
  "duration": 123,
  "segments": [
    {
      "id": 123,
      "start": 123,
      "end": 123,
      "text": "<string>"
    }
  ]
}

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
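
As a minimal sketch of the header format described above, the hypothetical helper below (buildAuthHeaders is an illustrative name, not part of any SDK) turns a token into the expected Authorization header.

```javascript
// Hypothetical helper (not part of any SDK): builds the Authorization header.
function buildAuthHeaders(token) {
  if (typeof token !== 'string' || token.trim() === '') {
    throw new Error('An auth token is required');
  }
  // Header format described above: "Bearer <token>".
  return { Authorization: `Bearer ${token.trim()}` };
}
```

The returned object can be passed as the headers option of a fetch call.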

Body

multipart/form-data
model (enum<string>, required, default: distil-whisper/distil-large-v2)

ID of the model to use. distil-whisper/distil-large-v2 has lower latency but supports English only. openai/whisper-large-v3-turbo is multilingual but has slightly higher latency.

Available options: distil-whisper/distil-large-v2, openai/whisper-large-v3-turbo

Example: "distil-whisper/distil-large-v2"

file (file)

The audio file to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm. File uploads are limited to 100 MB. Cannot be used together with file_url.

file_url (string)

A link to an audio file in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm. Hosted files are limited to 100 MB. Cannot be used together with file.

Example: "https://example.com/file.mp3"

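The file and file_url fields are mutually exclusive, so it can help to build the multipart body in one place. The sketch below uses the standard FormData and fetch globals (Node 18+ or a browser) rather than the SDK. The base URL https://api.telnyx.com/v2 is an assumption, since this page only shows the path. The helper names are hypothetical, and so is the check that at least one audio source is supplied.

```javascript
// Assumption: the base URL is not stated on this page.
const API_BASE = 'https://api.telnyx.com/v2';

// Hypothetical helper: builds the multipart/form-data body for the request.
function buildTranscriptionForm({ model = 'distil-whisper/distil-large-v2', file, fileUrl } = {}) {
  if (file && fileUrl) {
    throw new Error('file and file_url cannot be used together');
  }
  if (!file && !fileUrl) {
    // Assumption: the API needs one audio source to transcribe.
    throw new Error('Provide either file or file_url');
  }
  const form = new FormData();
  form.append('model', model);
  if (file) {
    form.append('file', file); // a Blob or File holding the audio
  } else {
    form.append('file_url', fileUrl);
  }
  return form;
}

// Hypothetical helper: sends the request and returns the parsed JSON body.
async function requestTranscription(token, options) {
  const res = await fetch(`${API_BASE}/ai/audio/transcriptions`, {
    method: 'POST',
    // Content-Type is left unset so fetch can add the multipart boundary.
    headers: { Authorization: `Bearer ${token}` },
    body: buildTranscriptionForm(options),
  });
  if (!res.ok) {
    throw new Error(`Transcription request failed with status ${res.status}`);
  }
  return res.json();
}
```

For example, requestTranscription(token, { fileUrl: 'https://example.com/file.mp3' }) would send a hosted file with the default model.
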
response_format (enum<string>, default: json)

The format of the transcript output. Use verbose_json to include the duration and timestamped segments in the response.

Available options: json, verbose_json

Example: "json"

timestamp_granularities[] (enum<string>)

The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Currently, only segment is supported.

Available options: segment

Example: "segment"

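Because timestamp_granularities[] only applies when response_format is verbose_json, a client can check that combination before sending. The sketch below is one way to do so with a standard FormData object. The helper name and its option names are hypothetical. The form field names are taken from this page.

```javascript
// Hypothetical helper: adds the output options to an existing form body.
function appendOutputOptions(form, { responseFormat = 'json', timestampGranularities = [] } = {}) {
  if (timestampGranularities.length > 0 && responseFormat !== 'verbose_json') {
    throw new Error('timestamp_granularities[] requires response_format=verbose_json');
  }
  for (const granularity of timestampGranularities) {
    // Only "segment" is listed as an available option.
    if (granularity !== 'segment') {
      throw new Error(`Unsupported timestamp granularity: ${granularity}`);
    }
  }
  form.append('response_format', responseFormat);
  for (const granularity of timestampGranularities) {
    form.append('timestamp_granularities[]', granularity);
  }
  return form;
}
```

Calling appendOutputOptions(form, { responseFormat: 'verbose_json', timestampGranularities: ['segment'] }) adds both fields to the form.
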
Response

Successful Response

text (string, required)

The transcribed text for the audio file.

duration (number)

The duration of the audio file in seconds. This is only included if response_format is set to verbose_json.

segments (object[])

Segments of the transcribed text and their corresponding details. This is only included if response_format is set to verbose_json.
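
To illustrate how a verbose_json response might be consumed, the sketch below turns the segments array into timestamped lines. It uses the id, start, end, and text fields shown in the sample response. It assumes start and end are in seconds, which this page states only for duration. The function names are hypothetical.

```javascript
// Formats a number of seconds as MM:SS, dropping fractional seconds.
function formatTimestamp(seconds) {
  const total = Math.floor(seconds);
  const mm = String(Math.floor(total / 60)).padStart(2, '0');
  const ss = String(total % 60).padStart(2, '0');
  return `${mm}:${ss}`;
}

// Returns one line per segment, or an empty array when segments is absent
// (for example, when response_format was json).
function formatSegments(response) {
  const segments = response.segments ?? [];
  return segments.map(
    (s) => `[${formatTimestamp(s.start)} - ${formatTimestamp(s.end)}] ${s.text.trim()}`
  );
}
```

With a response from the default json format, formatSegments returns an empty array, so callers can fall back to response.text.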