The output_type request field controls the format in which the synthesized audio is returned.

Streaming Audio (default)

With output_type: "binary_output" (or omitted), the response is raw audio over HTTP chunked transfer encoding:
HTTP/1.1 200 OK
Content-Type: audio/mpeg
Transfer-Encoding: chunked

<audio chunk 1>
<audio chunk 2>
...
Start reading the body as soon as the first chunk arrives instead of buffering the full response.
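As a rough illustration of reading the body incrementally, the Python sketch below copies a response body to a sink piece by piece. The helper names `iter_chunks` and `copy_stream` and the 4096-byte chunk size are illustrative assumptions, not part of the API. Any file-like body with a `read(n)` method should work.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(body: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield pieces of the body as they are read, stopping at end of stream."""
    while True:
        chunk = body.read(chunk_size)
        if not chunk:  # an empty read means the server has finished sending
            break
        yield chunk


def copy_stream(body: BinaryIO, sink: BinaryIO, chunk_size: int = 4096) -> int:
    """Write each piece to the sink as soon as it is read; return total bytes."""
    total = 0
    for chunk in iter_chunks(body, chunk_size):
        sink.write(chunk)  # hand the audio on immediately, no full buffering
        total += len(chunk)
    return total


if __name__ == "__main__":
    # Stand-in for a real HTTP response body (hypothetical data).
    fake_body = io.BytesIO(b"\x00" * 10_000)
    out = io.BytesIO()
    print(copy_stream(fake_body, out))
```

With the standard library, the object returned by `urllib.request.urlopen` is such a file-like body, and chunked transfer encoding is decoded before the bytes reach `read`. The sink could equally be an audio player's input pipe rather than a file.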

Base64

With output_type: "base64_output", the full audio is returned as a JSON payload after synthesis completes:
{"base64_audio": "<base64-encoded-audio>"}
Nothing is streamed in this mode: the entire file must be synthesized before the response is sent.
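A minimal Python sketch of handling this payload, assuming the response body is exactly the JSON object shown above. The function name `decode_base64_audio` is hypothetical.

```python
import base64
import json


def decode_base64_audio(response_text: str) -> bytes:
    """Parse the JSON payload and return the decoded audio bytes."""
    payload = json.loads(response_text)
    # "base64_audio" is the field name shown in the sample response.
    return base64.b64decode(payload["base64_audio"])


if __name__ == "__main__":
    # Hypothetical payload built locally to stand in for a real response.
    sample = json.dumps(
        {"base64_audio": base64.b64encode(b"fake-audio").decode("ascii")}
    )
    audio = decode_base64_audio(sample)
    print(len(audio))
```

The returned bytes can be written to a file opened in binary mode or passed to an audio library.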

Async (audio_id)

With output_type: "audio_id", synthesis runs in the background. You get a URL back immediately:
{"audio_url": "https://api.telnyx.com/v2/text-to-speech/speech/<id>"}
Retrieve the audio later with GET /v2/text-to-speech/speech/:audio_id. If the audio is still synthesizing, the GET response itself streams chunks as they become available.
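The retrieval step might be prepared as in the Python sketch below, which extracts the audio ID from the returned `audio_url` and builds the follow-up GET request without sending it. The function names are hypothetical, and the `Authorization: Bearer` header is an assumption about authentication that this section does not state.

```python
import json
import urllib.request
from urllib.parse import urlparse


def audio_id_from_response(response_text: str) -> str:
    """Return the last path segment of audio_url, which is the audio ID."""
    url = json.loads(response_text)["audio_url"]
    return urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]


def build_retrieval_request(response_text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for the synthesized audio."""
    url = json.loads(response_text)["audio_url"]
    return urllib.request.Request(
        url,
        # Assumption: bearer-token authentication; adjust to your setup.
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


if __name__ == "__main__":
    # Hypothetical response body with a made-up ID.
    body = '{"audio_url": "https://api.telnyx.com/v2/text-to-speech/speech/abc123"}'
    print(audio_id_from_response(body))
    print(build_retrieval_request(body, "YOUR_API_KEY").full_url)
```

Passing the request to `urllib.request.urlopen` returns a file-like body. Because the GET response may stream while synthesis is still running, read it incrementally rather than waiting for the whole body.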