AsyncAPI specification for the Telnyx Text-to-Speech WebSocket endpoint. The endpoint provides real-time speech synthesis: the client streams text and receives audio chunks.
Supported providers:
telnyx - Telnyx native voices (Natural, NaturalHD, Qwen3TTS)
aws - Amazon Polly
azure - Microsoft Azure TTS
elevenlabs - ElevenLabs voices
minimax - MiniMax voices
rime - Rime voices
resemble - Resemble AI voices
inworld - Inworld AI voices

Connection flow:
Connect to wss://api.telnyx.com/v2/text-to-speech/speech with query parameters.
Send the initial handshake message {"text": " "} (single space) with optional voice_settings.
Send text to synthesize, for example {"text": "Hello world"}.
isFinal: true indicates the end of audio for the current text.

Requires authentication via a Bearer token (Telnyx API v2 key).
{
  "text": " ",
  "voice_settings": {
    "voice_speed": 1.2
  }
}

{
  "audio": "QmFzZTY0RW5jb2RlZEF1ZGlv",
  "text": "Hello world",
  "isFinal": false,
  "cached": false,
  "timeToFirstAudioFrameMs": 245
}

{
  "audio": null,
  "text": "",
  "isFinal": true
}

{
  "error": "Invalid voice_id specified"
}

Telnyx API v2 Bearer token authentication.
Query parameters passed when opening the WebSocket connection.
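As a rough sketch of how a client might prepare the connection, the helper below appends caller-supplied query parameters to the endpoint URL and builds a Bearer header. The function name `build_connection`, the use of a standard `Authorization` header, and the `voice` parameter in the usage line are assumptions for illustration; this page does not list the actual query parameter names.

```python
from urllib.parse import urlencode

# Endpoint URL as given in this specification.
TTS_ENDPOINT = "wss://api.telnyx.com/v2/text-to-speech/speech"


def build_connection(api_key, params):
    """Return (url, headers) for opening the WebSocket.

    `params` is whatever query parameters the caller wants to send;
    their names are not defined here and must come from the spec.
    """
    query = urlencode(params)
    url = f"{TTS_ENDPOINT}?{query}" if query else TTS_ENDPOINT
    # Assumption: the Bearer token travels in a standard Authorization header.
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers


# Hypothetical usage: "voice" is an illustrative parameter name only.
url, headers = build_connection("YOUR_API_KEY", {"voice": "example-voice"})
```

The returned pair can be handed to whichever WebSocket client library is in use; the sketch deliberately stops short of opening the socket.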
Client-to-server frame containing text to synthesize. The initial handshake message should be {"text": " "} (single space) with optional voice_settings. Subsequent messages contain actual text. To interrupt synthesis mid-stream, send {"force": true}.
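The three client-to-server messages described above could be serialized as follows. This is a minimal sketch: the helper names are hypothetical, and only the JSON shapes (`text`, `voice_settings`, `force`) come from this page.

```python
import json


def handshake_message(voice_settings=None):
    """Initial frame: a single-space text, plus optional voice_settings."""
    message = {"text": " "}
    if voice_settings:
        message["voice_settings"] = voice_settings
    return json.dumps(message)


def text_message(text):
    """Subsequent frame carrying the actual text to synthesize."""
    return json.dumps({"text": text})


def interrupt_message():
    """Frame that interrupts synthesis mid-stream."""
    return json.dumps({"force": True})


# Example sequence a client might send, in order.
outgoing = [
    handshake_message({"voice_speed": 1.2}),
    text_message("Hello world"),
]
```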
Server-to-client frame containing a base64-encoded audio chunk. For providers that stream audio in real time (Telnyx Natural/NaturalHD, Rime, MiniMax, Resemble, Inworld), text will be null because audio is streamed before full text alignment is available, and cached will be false. For other providers, text contains the corresponding text segment.
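To illustrate how such a frame might be consumed, the sketch below parses one JSON frame and decodes its base64 `audio` field into raw bytes. The function name `decode_audio_chunk` is hypothetical, and the audio encoding of the decoded bytes is not specified on this page, so the bytes are returned as-is.

```python
import base64
import json


def decode_audio_chunk(raw_frame):
    """Return (audio_bytes, text) for one server-to-client audio frame.

    `text` may be None for providers that stream audio in real time.
    """
    frame = json.loads(raw_frame)
    audio = frame.get("audio")
    # A null audio field (as in the final frame) yields empty bytes.
    audio_bytes = base64.b64decode(audio) if audio else b""
    return audio_bytes, frame.get("text")


# The sample payload from this page is placeholder data, not real audio.
sample = '{"audio": "QmFzZTY0RW5jb2RlZEF1ZGlv", "text": "Hello world", "isFinal": false}'
chunk, text = decode_audio_chunk(sample)
```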
Server-to-client frame indicating synthesis is complete for the current text. The connection remains open for additional text messages.
Server-to-client frame indicating an error during synthesis. The connection will be closed shortly after sending this frame.
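Taken together, the three server-to-client frame types suggest a simple receive loop: accumulate audio until a frame with `isFinal: true` arrives, and stop on an `error` frame. The sketch below works over an iterable of already-received JSON strings so that it stays independent of any particular WebSocket library; `collect_audio` and `SynthesisError` are hypothetical names.

```python
import base64
import json


class SynthesisError(Exception):
    """Raised when the server sends an error frame."""


def collect_audio(raw_frames):
    """Concatenate audio chunks for one text message.

    Stops at the first frame with isFinal true; raises on an error frame.
    """
    audio = bytearray()
    for raw in raw_frames:
        frame = json.loads(raw)
        if "error" in frame:
            # Per the spec, the connection closes shortly after this frame.
            raise SynthesisError(frame["error"])
        if frame.get("audio"):
            audio.extend(base64.b64decode(frame["audio"]))
        if frame.get("isFinal"):
            break  # Synthesis is complete; the connection stays open.
    return bytes(audio)


frames = [
    '{"audio": "QmFzZTY0RW5jb2RlZEF1ZGlv", "text": null, "isFinal": false}',
    '{"audio": null, "text": "", "isFinal": true}',
]
result = collect_audio(frames)
```

Because the connection remains open after the final frame, a client could call the same loop again after sending the next text message.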