Text-to-Speech WebSocket configuration

Two Configuration Surfaces

Surface	Where	What	Mutable?
Query parameters	WebSocket URL	Voice selection, audio format, sample rate, connection options	No — locked at connect
`voice_settings`	Init frame (`{"text": " "}`)	Provider-specific tuning (speed, pitch, format, etc.)	No — locked at init

Both are one-shot. After the init frame, no configuration can change for the session. To change settings, open a new connection.

Query Parameters

Set on the URL at connect time. Immutable for the session.

Voice Selection

Parameter	Type	Default	Description
`voice`	string	—	Voice identifier in `Provider.Model.VoiceId` format.

The `VoiceId` segment (the third part of the `voice` string) refers to different things depending on the provider and model:

Type	Example	How you get it
Pre-built voice	`Telnyx.NaturalHD.astra`	Browse via the Voices API or Voice Design. Shipped by the provider — available to everyone.
Your cloned voice	`Telnyx.Qwen3TTS.my-ceo-clone`	Create it in Voice Design. Scoped to your organization: only your API key can use it. Available for Qwen3TTS and Minimax.
BYOK provider voice	`elevenlabs.v3.Adam`	A voice ID from your own ElevenLabs or Resemble account. You bring your own API key; Telnyx relays the request.

The Voices API (GET /v2/ai/tts/voices) returns all voices available to your account — pre-built and cloned — with each voice’s compound id ready to use as the voice parameter.
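
As a rough illustration of the `Provider.Model.VoiceId` format, the sketch below splits a voice string into its three segments. The `parse_voice` helper and `VoiceRef` type are hypothetical names introduced here, not part of any Telnyx SDK. It assumes the provider and model segments never contain a dot, so anything after the second dot is treated as the voice ID.

```python
from typing import NamedTuple


class VoiceRef(NamedTuple):
    """The three segments of a compound voice string (hypothetical helper type)."""
    provider: str
    model: str
    voice_id: str


def parse_voice(voice: str) -> VoiceRef:
    """Split 'Provider.Model.VoiceId' into its segments.

    Assumption: provider and model contain no dots, so the split stops
    after the second dot and the remainder is kept as the voice ID.
    """
    parts = voice.split(".", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected Provider.Model.VoiceId, got {voice!r}")
    return VoiceRef(*parts)
```

Running a check like this before connecting can catch a malformed voice string locally instead of at connect time.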

Connection Options

Parameter	Type	Default	Description
`language`	string	—	BCP-47 language code. Passed to the provider as `language_code`. Only used by providers that accept it (AWS Polly, Azure, ElevenLabs, Inworld).
`text_type`	string	`text`	Text type hint: `text` or `ssml`. Only AWS Polly and Azure use this.
`audio_format`	string	`mp3`	Output audio format: `mp3`, `linear16`, `wav`, `mulaw`, `alaw`, `ogg_vorbis`. Not every provider supports every format; see each provider's dedicated page.
`sample_rate`	integer	provider default	Output sample rate in Hz. Accepted values vary by provider; see each provider's dedicated page.

Example

wss://api.telnyx.com/v2/text-to-speech/speech?voice=Telnyx.NaturalHD.astra&audio_format=linear16&disable_cache=true
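
A connection URL like the one above could be assembled in code rather than by hand. The following is a minimal sketch that uses only the parameters documented in the tables above. `build_tts_url` is a hypothetical helper name, and the format check only mirrors the documented list: it cannot tell whether a particular provider supports a given format.

```python
from urllib.parse import urlencode

BASE_URL = "wss://api.telnyx.com/v2/text-to-speech/speech"

# Formats listed in the Connection Options table; provider support varies.
AUDIO_FORMATS = {"mp3", "linear16", "wav", "mulaw", "alaw", "ogg_vorbis"}


def build_tts_url(voice, audio_format="mp3", sample_rate=None,
                  language=None, text_type=None):
    """Build the WebSocket URL; these settings are locked once connected."""
    if audio_format not in AUDIO_FORMATS:
        raise ValueError(f"unsupported audio_format: {audio_format!r}")

    params = {"voice": voice, "audio_format": audio_format}
    # Optional parameters are omitted so the documented defaults apply.
    if sample_rate is not None:
        params["sample_rate"] = sample_rate
    if language is not None:
        params["language"] = language
    if text_type is not None:
        params["text_type"] = text_type

    return f"{BASE_URL}?{urlencode(params)}"
```

Because query parameters cannot change after connecting, switching voice or format means calling a helper like this again and opening a new connection with the resulting URL.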

Voice Settings

Provider-specific tuning (speed, pitch, format, emotion, etc.) is not set via query parameters. It is passed once in the voice_settings object on the initialization frame:

{
  "text": " ",
  "voice_settings": {
    "voice_speed": 1.2,
    "emotion": "happy"
  }
}

Voice settings are applied when the synthesis worker starts and cannot be changed mid-session. There are no common voice_settings fields. Every field is provider-specific — the available fields, defaults, and accepted values are completely different per provider. Unrecognized fields are silently ignored. See your selected provider’s page under Providers for the exact fields.
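
The initialization frame can be serialized with any JSON library. The sketch below builds it in Python; `build_init_frame` is a hypothetical helper name, and the field names in the usage line are copied from the example above rather than from any specific provider's documentation. It assumes `voice_settings` may be left out entirely when no tuning is needed, since the frame is described elsewhere on this page as `{"text": " "}`.

```python
import json


def build_init_frame(voice_settings=None):
    """Return the JSON text of the initialization frame.

    voice_settings is passed through untouched: every field is
    provider-specific, so no validation is attempted here.
    """
    frame = {"text": " "}
    if voice_settings:
        # Copy so later changes to the caller's dict do not alter the frame.
        frame["voice_settings"] = dict(voice_settings)
    return json.dumps(frame)


init_frame = build_init_frame({"voice_speed": 1.2, "emotion": "happy"})
```

The resulting string would be sent as the first message on the socket. Since unrecognized fields are silently ignored, a misspelled field name will not raise an error, so it is worth checking field names against the provider's page.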
