Stream text to speech over WebSocket
AsyncAPI specification for the Telnyx Text-to-Speech WebSocket endpoint. The endpoint performs real-time speech synthesis: the client streams text and receives audio chunks in return.
Supported Providers
- telnyx: Telnyx native voices (Natural, NaturalHD, Qwen3TTS)
- aws: Amazon Polly
- azure: Microsoft Azure TTS
- elevenlabs: ElevenLabs voices
- minimax: MiniMax voices
- rime: Rime voices
- resemble: Resemble AI voices
- xai: xAI voices (Eve, Ara, Rex, Sal, Leo)
- inworld: Inworld AI voices
Connection Flow
- Open a WebSocket connection to wss://api.telnyx.com/v2/text-to-speech/speech with query parameters.
- Send an initial handshake message {"text": " "} (single space) with optional voice_settings.
- Send text messages as {"text": "Hello world"}.
- Receive audio chunks as JSON frames with base64-encoded audio.
- A final frame with isFinal: true indicates the end of audio for the current text.
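The flow above can be sketched with a few helpers that build the outgoing frames and decode the incoming ones. This is a minimal illustration using only the Python standard library, not an official client. The query parameter names passed to build_url and the audio_key field name ("audio") are assumptions that this page does not specify, so check them against the message schema. Sending and receiving the frames requires a WebSocket client library of your choice.

```python
import base64
import json
from urllib.parse import urlencode

BASE_URL = "wss://api.telnyx.com/v2/text-to-speech/speech"


def build_url(params):
    """Append query parameters (names are caller-supplied assumptions) to the endpoint."""
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL


def handshake_message(voice_settings=None):
    """Initial frame: a single space as text, plus optional voice_settings."""
    message = {"text": " "}
    if voice_settings is not None:
        message["voice_settings"] = voice_settings
    return json.dumps(message)


def text_message(text):
    """Frame carrying the text to synthesize."""
    return json.dumps({"text": text})


def collect_audio(raw_frames, audio_key="audio"):
    """Decode base64 audio from JSON frames, stopping at the isFinal frame.

    audio_key is an assumed field name, not confirmed by this page.
    """
    audio = bytearray()
    for raw in raw_frames:
        frame = json.loads(raw)
        payload = frame.get(audio_key)
        if payload:
            audio.extend(base64.b64decode(payload))
        if frame.get("isFinal"):
            break
    return bytes(audio)
```

In use, a client would connect to build_url(...), send handshake_message() once, send one text_message(...) per piece of text, and pass the received frames to collect_audio until the frame with isFinal set to true arrives.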
Authentication
Requires authentication via a Bearer token (Telnyx API v2 key).
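As a rough sketch, a Bearer token is conventionally carried in an Authorization header on the WebSocket upgrade request. The helper below builds that header. The environment variable name TELNYX_API_KEY is a hypothetical choice for illustration, and how the headers are attached depends on the WebSocket client library in use.

```python
import os


def auth_headers(api_key=None):
    """Build the Authorization header for a Bearer token.

    Falls back to the TELNYX_API_KEY environment variable,
    which is an assumed name used only for this example.
    """
    key = api_key or os.environ.get("TELNYX_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {"Authorization": f"Bearer {key}"}
```

Reading the key from the environment keeps it out of source code; passing it explicitly is convenient for tests.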
Messages