Models

| Model | Voice Format | Latency | Languages |
| --- | --- | --- | --- |
| Natural | Telnyx.Natural.<voice> | Low | English |
| NaturalHD | Telnyx.NaturalHD.<voice> | Medium | en, fr, de, es, ar, hi, ja, he, pt |
| KokoroTTS | Telnyx.KokoroTTS.<voice> | Low | |
| Qwen3TTS | Telnyx.Qwen3TTS.<clone_name> | Medium | en, zh, fr, de, it, ja, ko, pt, ru, es |
| Ultra | Telnyx.Ultra.<voice> | Medium | Multilingual |
Ultra is not available over WebSocket. Use the REST API for Ultra.
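
As an illustration of the voice format column, here is a minimal sketch of a helper that assembles a voice identifier and rejects Ultra for WebSocket use. The function name `build_voice_id` and its `websocket` flag are hypothetical and not part of any Telnyx SDK; only the `Telnyx.<Model>.<voice>` pattern and the Ultra restriction come from the table and note above.

```python
# Models listed in the table above; Ultra is REST-only per the note.
MODELS = {"Natural", "NaturalHD", "KokoroTTS", "Qwen3TTS", "Ultra"}
REST_ONLY = {"Ultra"}


def build_voice_id(model: str, voice: str, *, websocket: bool = True) -> str:
    """Return a voice identifier such as 'Telnyx.NaturalHD.astra'.

    Hypothetical helper: it only formats and sanity-checks the string.
    """
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if websocket and model in REST_ONLY:
        raise ValueError(f"{model} is not available over WebSocket; use the REST API")
    if not voice:
        raise ValueError("voice name must be non-empty")
    return f"Telnyx.{model}.{voice}"


print(build_voice_id("NaturalHD", "astra"))  # Telnyx.NaturalHD.astra
```

For Qwen3TTS, the `voice` argument would be the clone name rather than a pre-built voice name.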

Natural & NaturalHD

Pre-built voices. Browse available voices via the Voices API or the Voice Design Lab.

Audio Format

Default: MP3. NaturalHD supports the audio_format query parameter to override the default:
?voice=Telnyx.NaturalHD.astra&audio_format=pcm
Accepted values: pcm, wav.
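
The query string above could be assembled programmatically along these lines. This is a sketch only: `build_query` is a hypothetical helper, and the endpoint URL the query string is appended to is not given in this section, so it is deliberately left out.

```python
from typing import Optional
from urllib.parse import urlencode

# Override values listed above; omit audio_format to keep the MP3 default.
ACCEPTED_FORMATS = {"pcm", "wav"}


def build_query(voice: str, audio_format: Optional[str] = None) -> str:
    """Build the '?voice=...&audio_format=...' query string (hypothetical helper)."""
    params = {"voice": voice}
    if audio_format is not None:
        if audio_format not in ACCEPTED_FORMATS:
            raise ValueError(f"audio_format must be one of {sorted(ACCEPTED_FORMATS)}")
        params["audio_format"] = audio_format
    return "?" + urlencode(params)


print(build_query("Telnyx.NaturalHD.astra", "pcm"))
# ?voice=Telnyx.NaturalHD.astra&audio_format=pcm
```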

Voice Settings

| Field | Type | Description |
| --- | --- | --- |
| voice_speed | float | Playback speed multiplier |
| embedding_scale | float | Voice embedding intensity |
| diffusion_steps | integer | Quality/latency tradeoff; more steps give higher quality |
| phonemizer | string | Phonemizer backend selection |
| response_format | string | Output format override |
| sampling_rate | integer | Sample rate in Hz |
| temperature | float | Synthesis variability |
| volume | float | Output volume |
| emotion | string | Emotional tone |
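
One way to keep these fields typed on the client side is a small container that serializes only the values that were actually set. This is a sketch under assumptions: the class name `VoiceSettings` is hypothetical, the sample values are purely illustrative, and this section does not specify accepted ranges or how the settings object is wrapped in a request.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class VoiceSettings:
    """Field names and types mirror the table above; all fields are optional."""
    voice_speed: Optional[float] = None
    embedding_scale: Optional[float] = None
    diffusion_steps: Optional[int] = None
    phonemizer: Optional[str] = None
    response_format: Optional[str] = None
    sampling_rate: Optional[int] = None
    temperature: Optional[float] = None
    volume: Optional[float] = None
    emotion: Optional[str] = None

    def to_dict(self) -> dict:
        # Drop unset fields so the service can apply its own defaults.
        return {k: v for k, v in asdict(self).items() if v is not None}


# Illustrative values only; valid ranges are not documented in this section.
settings = VoiceSettings(voice_speed=1.25, sampling_rate=24000)
print(json.dumps(settings.to_dict()))  # {"voice_speed": 1.25, "sampling_rate": 24000}
```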

Qwen3TTS

Voice cloning model. The voice_id is the name of a clone created in the Voice Design Lab. The clone must belong to your organization, and using a cloned voice may require identity verification.

Audio Format

Always raw PCM: 24 kHz, signed 16-bit little-endian, mono. The backend forces this format regardless of any output_format value sent.
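
Because raw PCM carries no header, most players need the sample rate, sample width, and channel count supplied separately. A minimal sketch using Python's standard `wave` module wraps the bytes in a WAV container with the parameters stated above. The helper names are hypothetical, and the generated silence stands in for audio received from the service.

```python
import io
import wave

SAMPLE_RATE_HZ = 24_000   # 24 kHz
SAMPLE_WIDTH_BYTES = 2    # signed 16-bit little-endian
CHANNELS = 1              # mono


def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap raw PCM bytes in a WAV container so standard players can open them."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm)
    return buf.getvalue()


def duration_seconds(pcm: bytes) -> float:
    """Playback length implied by the byte count at the fixed format."""
    return len(pcm) / (SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS)


# One second of silence as stand-in data.
silence = b"\x00\x00" * SAMPLE_RATE_HZ
print(duration_seconds(silence))  # 1.0
wav_bytes = pcm_to_wav(silence)
```

At this format, one second of audio is 48,000 bytes, which is a quick check for sizing buffers when streaming.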

Voice Settings

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| language_boost | string | "Auto" | Target language. Accepted: Auto, English, Chinese, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish, or ISO codes |
| force_xvector | boolean | false | Force x-vector voice embedding |
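
A client-side check of these two settings might look like the sketch below. The function `qwen3_settings` is hypothetical. The language names come from the table above, while the set of ISO codes is an assumption taken from the Qwen3TTS row of the models table, since this section does not list which codes are accepted.

```python
LANGUAGE_NAMES = {
    "Auto", "English", "Chinese", "French", "German", "Italian",
    "Japanese", "Korean", "Portuguese", "Russian", "Spanish",
}
# Assumption: the ISO codes accepted match those listed for Qwen3TTS under Models.
ISO_CODES = {"en", "zh", "fr", "de", "it", "ja", "ko", "pt", "ru", "es"}


def qwen3_settings(language_boost: str = "Auto", force_xvector: bool = False) -> dict:
    """Return a settings dict using the documented defaults (hypothetical helper)."""
    if language_boost not in LANGUAGE_NAMES and language_boost not in ISO_CODES:
        raise ValueError(f"unsupported language_boost: {language_boost!r}")
    return {"language_boost": language_boost, "force_xvector": force_xvector}


print(qwen3_settings())  # {'language_boost': 'Auto', 'force_xvector': False}
```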

KokoroTTS

Lightweight, low-latency model. Suitable for high-throughput applications where quality tradeoffs are acceptable.