Telnyx

Voice Format

Provider.Model.VoiceId

Examples:

Telnyx.NaturalHD.astra
aws.Polly.Generative.Lucia
azure.en-US-AvaMultilingualNeural
elevenlabs.v3.Adam

Dots are allowed within model IDs. The voice parser handles multi-segment names like aws.Polly.Generative.Lucia correctly.
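The splitting rule can be sketched as follows. This is a minimal illustration, not the actual parser: it assumes the first segment is the provider, the last segment is the voice ID, and everything in between (dots included) is the model. Lowercasing the provider to match the keys in the provider table is also an assumption, and `VoiceRef` and `parse_voice` are hypothetical names.

```python
from typing import NamedTuple, Optional


class VoiceRef(NamedTuple):
    provider: str
    model: Optional[str]  # None when the string holds only provider and voice
    voice_id: str


def parse_voice(voice: str) -> VoiceRef:
    """Split a voice string into provider, model, and voice ID.

    Assumption: first segment = provider, last segment = voice ID,
    and any segments in between are joined back into the model name.
    """
    parts = voice.split(".")
    if len(parts) < 2 or any(not part for part in parts):
        raise ValueError(f"malformed voice string: {voice!r}")
    provider, *model_parts, voice_id = parts
    model = ".".join(model_parts) or None
    # Assumption: provider keys are matched case-insensitively.
    return VoiceRef(provider.lower(), model, voice_id)
```

Under these assumptions, `aws.Polly.Generative.Lucia` yields the model `Polly.Generative`, while `azure.en-US-AvaMultilingualNeural` has no model segment at all.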

Provider Summary

Provider	Key	Models	Audio Delivery
Telnyx	`telnyx`	Natural, NaturalHD, KokoroTTS, Qwen3TTS	Streamed
AWS Polly	`aws`	standard, neural, generative, long-form	Concatenated
Azure Speech	`azure`	Neural voices	Concatenated
ElevenLabs	`elevenlabs`	v2, v3, MultiPL.v2	Direct relay
Minimax	`minimax`	—	Streamed
Rime	`rime`	ArcanaV3	Streamed
Resemble	`resemble`	Turbo (default)	Streamed
Inworld	`inworld`	inworld-tts-1.5-mini, inworld-tts-1.5-max	Streamed

Streamed providers send audio in incremental frames — the audio field on the text-bearing chunk is null. Concatenated providers return full audio in a single chunk. Direct relay means frames are forwarded to the upstream provider’s WebSocket.
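A client that handles both streamed and concatenated providers can treat every frame the same way and skip frames whose audio is null. The sketch below assumes each frame has already been decoded into a dictionary and that the `audio` field carries base64-encoded bytes; the encoding and the `collect_audio` helper are assumptions, not part of the documented frame format.

```python
import base64
from typing import Iterable, Mapping, Optional


def collect_audio(frames: Iterable[Mapping[str, Optional[str]]]) -> bytes:
    """Join the audio payloads of a sequence of frames, skipping null audio."""
    out = bytearray()
    for frame in frames:
        payload = frame.get("audio")
        if payload is None:
            # Text-bearing chunk from a streamed provider: no audio attached.
            continue
        # Assumption: audio is delivered as a base64 string.
        out += base64.b64decode(payload)
    return bytes(out)
```

With a concatenated provider the loop appends a single payload; with a streamed provider it appends one payload per incremental frame.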

Telnyx Ultra is not available over WebSocket. Use the REST API for Ultra.

Telnyx

Model	Description	Languages
Natural	Fast, low-latency synthesis	English
NaturalHD	Higher quality, supports multiple languages	en, fr, de, es, ar, hi, ja, he, pt
KokoroTTS	Lightweight model	—
Qwen3TTS	Voice cloning. Requires a cloned voice name as `voice_id`.	en, zh, fr, de, it, ja, ko, pt, ru, es

Voice IDs for Natural/NaturalHD correspond to pre-built voices. Browse available voices via the Voices API endpoint or the Voice Design Lab. Qwen3TTS voices require a voice clone created in the Voice Design Lab. The voice_id is the clone name. Cloned voice usage may require identity verification.

AWS Polly

Voice format: aws.Polly.<Engine>.<VoiceId>
Engines: standard, neural, generative, long-form
Example: aws.Polly.Generative.Lucia
Engine is parsed from the voice ID suffix (e.g., a voice ending in -longform maps to the long-form engine).
Supports SSML input via text_type: "ssml" in voice settings.
Voices: AWS Polly voice list

Azure Speech

Voice format: azure.<VoiceId>
Example: azure.en-US-AvaMultilingualNeural
Default voice: en-US-AvaMultilingualNeural
Default output format: audio-24khz-160kbitrate-mono-mp3
Supports SSML input and audio effects (eq_car, eq_telecomhp8k).
Voices: Azure Speech voices

ElevenLabs

ElevenLabs connections are relayed directly to the ElevenLabs WebSocket API. Frames pass through without going through the standard text buffering pipeline.

Requires an ElevenLabs API key (configured in voice settings or account config).
Voices: ElevenLabs voice library

Minimax

Supports voice cloning. Cloned voices are scoped to your organization.
Voice settings: speed (float), vol (float), pitch (integer), language_boost (string).
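The four Minimax settings and their types can be captured in a small typed container. This is a sketch only: treating every field as optional is an assumption, valid ranges are not documented here and are not checked, and `MinimaxVoiceSettings` and `to_payload` are hypothetical names.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class MinimaxVoiceSettings:
    speed: Optional[float] = None
    vol: Optional[float] = None
    pitch: Optional[int] = None
    language_boost: Optional[str] = None

    def to_payload(self) -> dict:
        """Check field types and return only the settings that were set."""
        for name in ("speed", "vol"):
            value = getattr(self, name)
            if value is not None and (
                isinstance(value, bool) or not isinstance(value, (int, float))
            ):
                raise TypeError(f"{name} must be a float")
        if self.pitch is not None and (
            isinstance(self.pitch, bool) or not isinstance(self.pitch, int)
        ):
            raise TypeError("pitch must be an integer")
        if self.language_boost is not None and not isinstance(self.language_boost, str):
            raise TypeError("language_boost must be a string")
        return {key: value for key, value in asdict(self).items() if value is not None}
```

Unset fields are left out of the payload so that provider-side defaults still apply.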

Rime

Voice format: Rime.ArcanaV3.<VoiceId>

Resemble

Self-hosted synthesis engine.
Voice settings: precision (PCM_16, PCM_24, PCM_32, MULAW), sample_rate (8000–48000), format (wav, mp3).
Default model: Turbo
Default format: mp3
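The allowed Resemble values can be checked client-side before a request is sent. The sketch below assumes the sample rate range is inclusive and that any integer inside it is acceptable, which the documentation does not confirm; `resemble_settings` is a hypothetical helper.

```python
from typing import Optional

PRECISIONS = {"PCM_16", "PCM_24", "PCM_32", "MULAW"}
FORMATS = {"wav", "mp3"}


def resemble_settings(
    precision: Optional[str] = None,
    sample_rate: Optional[int] = None,
    format: str = "mp3",  # documented default format
) -> dict:
    """Validate Resemble voice settings and return them as a dictionary."""
    if precision is not None and precision not in PRECISIONS:
        raise ValueError(f"unsupported precision: {precision!r}")
    # Assumption: the 8000-48000 range is inclusive on both ends.
    if sample_rate is not None and not 8000 <= sample_rate <= 48000:
        raise ValueError(f"sample_rate out of range: {sample_rate}")
    if format not in FORMATS:
        raise ValueError(f"unsupported format: {format!r}")
    settings = {"format": format}
    if precision is not None:
        settings["precision"] = precision
    if sample_rate is not None:
        settings["sample_rate"] = sample_rate
    return settings
```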

Inworld

Models: inworld-tts-1.5-mini (faster), inworld-tts-1.5-max (higher quality)
Aliases: Mini, Max
Encodings: MP3, LINEAR16
Default: LINEAR16 for WebSocket, MP3 for REST
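Alias resolution and the per-transport default encoding can be sketched as two lookups. The alias mapping assumes Mini and Max correspond to the two models in the order listed, and the transport labels `websocket` and `rest` are hypothetical names chosen for this example.

```python
from typing import Optional

INWORLD_MODELS = ("inworld-tts-1.5-mini", "inworld-tts-1.5-max")
# Assumption: aliases map to the models in the order they are listed.
INWORLD_ALIASES = {"Mini": INWORLD_MODELS[0], "Max": INWORLD_MODELS[1]}
INWORLD_ENCODINGS = ("MP3", "LINEAR16")
DEFAULT_ENCODING = {"websocket": "LINEAR16", "rest": "MP3"}


def resolve_inworld_model(name: str) -> str:
    """Return the full model ID for either a full ID or an alias."""
    if name in INWORLD_MODELS:
        return name
    if name in INWORLD_ALIASES:
        return INWORLD_ALIASES[name]
    raise ValueError(f"unknown Inworld model: {name!r}")


def pick_encoding(transport: str, encoding: Optional[str] = None) -> str:
    """Return the requested encoding, or the default for the transport."""
    if encoding is None:
        return DEFAULT_ENCODING[transport]
    if encoding not in INWORLD_ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding!r}")
    return encoding
```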

Voices API

List available voices:

GET https://api.telnyx.com/v2/text-to-speech/voices

Filter by provider:

GET https://api.telnyx.com/v2/text-to-speech/voices?provider=telnyx

Get a specific voice:

GET https://api.telnyx.com/v2/text-to-speech/voices?voice_id=Telnyx.NaturalHD.astra
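The three requests above differ only in their query string, so a single helper can build all of them. This sketch uses only the Python standard library. It assumes the endpoint accepts a Bearer token in the `Authorization` header and returns JSON; the response shape is not assumed, and `voices_url` and `list_voices` are hypothetical names.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

VOICES_URL = "https://api.telnyx.com/v2/text-to-speech/voices"


def voices_url(provider: Optional[str] = None, voice_id: Optional[str] = None) -> str:
    """Build the voices URL, adding only the filters that were supplied."""
    params = {
        key: value
        for key, value in (("provider", provider), ("voice_id", voice_id))
        if value
    }
    if not params:
        return VOICES_URL
    return f"{VOICES_URL}?{urllib.parse.urlencode(params)}"


def list_voices(api_key: str, provider: Optional[str] = None,
                voice_id: Optional[str] = None):
    """Fetch voices and return the decoded JSON body as-is."""
    request = urllib.request.Request(
        voices_url(provider, voice_id),
        headers={
            # Assumption: Bearer token authentication.
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `voices_url(provider="telnyx")` reproduces the provider-filtered URL shown above.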

