
Voice Format

Voice strings take the form Provider.Model.VoiceId.

All providers available on WebSocket are also available via REST. The REST API additionally supports the Telnyx Ultra model, which is not available over WebSocket.
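To illustrate the Provider.Model.VoiceId format, here is a minimal sketch that splits a voice string into its three parts. The helper name `parse_voice`, the `Voice` type, and the example voice ID `astra` are hypothetical and not part of any SDK. The sketch assumes the voice ID is everything after the second dot, so it will not fit providers whose format has a different number of segments.

```python
from typing import NamedTuple


class Voice(NamedTuple):
    provider: str
    model: str
    voice_id: str


def parse_voice(voice: str) -> Voice:
    """Split 'Provider.Model.VoiceId' into its parts (hypothetical helper)."""
    # Split on the first two dots only, so a voice ID may itself contain dots.
    parts = voice.split(".", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected Provider.Model.VoiceId, got {voice!r}")
    return Voice(*parts)


# 'astra' is a placeholder voice ID used only for illustration.
print(parse_voice("Telnyx.NaturalHD.astra"))
```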

Telnyx

| Model | WebSocket | REST | Description |
| --- | --- | --- | --- |
| Natural | Yes | Yes | Fast, English-only |
| NaturalHD | Yes | Yes | Higher quality, multilingual (en, fr, de, es, ar, hi, ja, he, pt) |
| Ultra | No | Yes | Sub-100ms latency, 44 languages, emotion/speed/volume control |
| KokoroTTS | Yes | Yes | Lightweight synthesis |
| Qwen3TTS | Yes | Yes | Voice cloning (en, zh, fr, de, it, ja, ko, pt, ru, es) |
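A client may want to check transport availability before opening a connection. The sketch below encodes the WebSocket and REST columns of the table above as a lookup. The names `TELNYX_MODELS` and `supports` are hypothetical, and the data is copied by hand from this page rather than fetched from any API.

```python
# Transport availability per Telnyx model, copied from the table above.
TELNYX_MODELS = {
    "Natural": {"websocket", "rest"},
    "NaturalHD": {"websocket", "rest"},
    "Ultra": {"rest"},  # Ultra is REST-only
    "KokoroTTS": {"websocket", "rest"},
    "Qwen3TTS": {"websocket", "rest"},
}


def supports(model: str, transport: str) -> bool:
    """Return True if the model is listed for the given transport."""
    return transport.lower() in TELNYX_MODELS.get(model, set())


print(supports("Ultra", "websocket"))
print(supports("Ultra", "REST"))
```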

Ultra

Ultra supports 44 languages and the following expressive controls:

| Parameter | Type | Range | Description |
| --- | --- | --- | --- |
| emotion | string | | Emotion tag: neutral, happy, sad, angry, etc. |
| speed | float | 0.5–2.0 | Speech rate (1.0 = normal) |
| volume | float | 0.0–2.0 | Volume level (1.0 = normal) |

Language is auto-detected from the text. Pass the language parameter to force a specific language (ISO 639-1 code or full name).
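The documented ranges can be checked on the client before a request is sent. The following is a minimal sketch: the parameter names and ranges come from the table above, while the function name `ultra_params` and the flat dictionary it returns are assumptions, since this page does not show where these fields sit in the request body.

```python
from typing import Optional


def ultra_params(
    emotion: str = "neutral",
    speed: float = 1.0,
    volume: float = 1.0,
    language: Optional[str] = None,
) -> dict:
    """Validate Ultra controls against the documented ranges (hypothetical helper)."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError(f"speed must be within 0.5-2.0, got {speed}")
    if not 0.0 <= volume <= 2.0:
        raise ValueError(f"volume must be within 0.0-2.0, got {volume}")
    params = {"emotion": emotion, "speed": speed, "volume": volume}
    # Omit language so the service auto-detects it from the text.
    if language is not None:
        params["language"] = language
    return params


print(ultra_params(emotion="happy", speed=1.2, language="fr"))
```

Leaving `language` unset keeps the auto-detection behavior described above; the emotion tag is passed through unchecked because the full list of accepted tags is not given here.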

Qwen3TTS

Requires a voice clone created via the Voice Design Lab. The voice_id is the clone name. Cloned voice usage requires L2 identity verification or the voice.cloned_voice_usage: allowed capability on your organization.

AWS Polly

Voice format: aws.Polly.<Engine>.<VoiceId>, where <Engine> is one of standard, neural, generative, or long-form. Supports SSML via text_type: "ssml".
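As a rough sketch of how the Polly voice string and the SSML flag fit together, the helpers below build the voice string from an engine and voice ID and pair it with `text_type: "ssml"`. The function names and the `voice` and `text` keys are assumptions made for illustration; only the voice format, the engine list, and `text_type` come from this page. The engine casing is assumed to match the list as written, and `Joanna` is used as an example voice ID.

```python
POLLY_ENGINES = ("standard", "neural", "generative", "long-form")


def polly_voice(engine: str, voice_id: str) -> str:
    """Build 'aws.Polly.<Engine>.<VoiceId>' (hypothetical helper)."""
    if engine not in POLLY_ENGINES:
        raise ValueError(f"unknown engine {engine!r}; expected one of {POLLY_ENGINES}")
    if not voice_id:
        raise ValueError("voice_id must not be empty")
    return f"aws.Polly.{engine}.{voice_id}"


def polly_ssml_request(engine: str, voice_id: str, ssml: str) -> dict:
    """Pair the voice string with SSML input; 'voice' and 'text' keys are assumed."""
    return {
        "voice": polly_voice(engine, voice_id),
        "text": ssml,
        "text_type": "ssml",
    }


print(polly_ssml_request("neural", "Joanna", "<speak>Hello</speak>"))
```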

Azure Speech

Voice format: azure.<VoiceId>

Supports SSML and audio effects (eq_car, eq_telecomhp8k).

ElevenLabs

Requires an ElevenLabs API key. Models: v2, v3, MultiPL.v2.

Minimax

Supports system voices and organization-scoped voice clones.

Rime

Model: ArcanaV3.

Resemble

Self-hosted synthesis. Model: Turbo (default). Supports wav and mp3.

Inworld

Models: inworld-tts-1.5-mini (faster), inworld-tts-1.5-max (higher quality).

OpenAI SDK Mapping

| OpenAI model | Telnyx Provider/Model |
| --- | --- |
| tts-1 | Telnyx.NaturalHD |
| tts-1-hd | Telnyx.NaturalHD |
| gpt-4o-mini-tts | Telnyx.Ultra |
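The mapping above can be expressed as a simple lookup, which is handy when logging or debugging which Telnyx model an OpenAI-style request resolves to. This is only a sketch: `OPENAI_TO_TELNYX` and `resolve_model` are hypothetical names, and the actual translation is performed by the service, not by client code.

```python
# OpenAI model name -> Telnyx provider/model, copied from the table above.
OPENAI_TO_TELNYX = {
    "tts-1": "Telnyx.NaturalHD",
    "tts-1-hd": "Telnyx.NaturalHD",
    "gpt-4o-mini-tts": "Telnyx.Ultra",
}


def resolve_model(openai_model: str) -> str:
    """Return the Telnyx provider/model listed for an OpenAI model name."""
    try:
        return OPENAI_TO_TELNYX[openai_model]
    except KeyError:
        raise ValueError(f"no mapping listed for {openai_model!r}") from None


print(resolve_model("gpt-4o-mini-tts"))
```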