Telnyx

Required

text

string

required

Text to synthesize. Accepts plain text, or SSML when text_type is "ssml" and the provider supports it.

voice

string

required

Voice identifier in Provider.Model.VoiceId format. Examples: Telnyx.Ultra.aura, Telnyx.NaturalHD.astra, aws.Polly.Neural.Joanna.
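As a rough illustration of the two required fields, the sketch below builds a minimal request body and sanity-checks the voice identifier. The helper names (`parse_voice`, `minimal_body`) are hypothetical and not part of any Telnyx SDK. The splitting rule is an assumption inferred from the examples above: the first segment is treated as the provider, the last as the voice ID, and everything in between as the model, which lets a four-part value such as aws.Polly.Neural.Joanna parse.

```python
from typing import NamedTuple


class VoiceRef(NamedTuple):
    provider: str
    model: str
    voice_id: str


def parse_voice(voice: str) -> VoiceRef:
    """Split a voice identifier into provider, model, and voice ID.

    Assumption: first segment = provider, last segment = voice ID,
    and the middle segments (joined by dots) = model.
    """
    parts = voice.split(".")
    if len(parts) < 3 or not all(parts):
        raise ValueError(f"expected Provider.Model.VoiceId, got {voice!r}")
    return VoiceRef(parts[0], ".".join(parts[1:-1]), parts[-1])


def minimal_body(text: str, voice: str) -> dict:
    """Build a request body containing only the two required fields."""
    if not text:
        raise ValueError("text is required")
    parse_voice(voice)  # raises ValueError if the identifier is malformed
    return {"text": text, "voice": voice}
```

This check is purely client-side; whether a given voice actually exists is decided by the service.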

Output

output_type

string

Response format. Default: binary_output.

binary_output: raw audio bytes, with a Content-Type header
base64_output: JSON with a base64_audio field
audio_id: an ID for asynchronous retrieval via GET /v2/text-to-speech/speech/:audio_id
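The sketch below shows one way a client might handle the two non-binary formats. It assumes that base64_audio is a top-level key of the JSON response, and the function names are hypothetical. The retrieval path is the one quoted in the audio_id description; how the ID itself is returned is not specified here, so the example takes it as an argument.

```python
import base64
import json
from urllib.parse import quote

# Path taken from the audio_id description above.
SPEECH_PATH = "/v2/text-to-speech/speech"


def audio_from_base64_response(body: str) -> bytes:
    """Decode the audio carried in a base64_output JSON response.

    Assumption: base64_audio is a top-level key of the JSON object.
    """
    payload = json.loads(body)
    return base64.b64decode(payload["base64_audio"], validate=True)


def retrieval_path(audio_id: str) -> str:
    """Build the GET path used to fetch audio requested with audio_id."""
    return f"{SPEECH_PATH}/{quote(audio_id, safe='')}"
```

For binary_output no decoding is needed: the response body is already the audio, and the Content-Type header identifies its format.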

Content

text_type

string

Input text format: text (default) or ssml. SSML is supported by AWS Polly and Azure.
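A minimal sketch of an SSML request body follows. The `ssml_body` helper is hypothetical. It checks only that the markup is well-formed XML with a `speak` root element, which is a client-side convenience and not a guarantee that the provider accepts every tag used.

```python
import xml.etree.ElementTree as ET


def ssml_body(ssml: str, voice: str) -> dict:
    """Build a request body that sends SSML instead of plain text."""
    root = ET.fromstring(ssml)  # raises ET.ParseError on malformed XML
    # Strip an XML namespace prefix such as "{http://...}speak" if present.
    tag = root.tag.rsplit("}", 1)[-1]
    if tag != "speak":
        raise ValueError(f"SSML root element must be <speak>, got <{tag}>")
    return {"text": ssml, "voice": voice, "text_type": "ssml"}


body = ssml_body(
    '<speak>Hello <break time="500ms"/> world</speak>',
    "aws.Polly.Neural.Joanna",
)
```

The voice here is an AWS Polly one because SSML support is listed for AWS Polly and Azure.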

language

string

Language, given as an ISO 639-1 code or a full language name. Used by providers that support language selection.

Voice Settings

voice_settings

object

Provider-specific parameters. See Voice Settings.

Pronunciation

pronunciation_dict_id

string

ID of a pronunciation dictionary whose custom word replacements are applied before synthesis.

Caching

disable_cache

boolean

When true, bypasses the audio cache and synthesizes fresh audio. Default: false.
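Putting the fields together, a request body could be modeled roughly as below. The `SpeechRequest` class is a hypothetical illustration, not an official client type. It leaves unset optional fields out of the JSON, on the assumption that the service then applies the documented defaults (binary_output, text, false).

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

OUTPUT_TYPES = {"binary_output", "base64_output", "audio_id"}
TEXT_TYPES = {"text", "ssml"}


@dataclass
class SpeechRequest:
    text: str
    voice: str
    output_type: Optional[str] = None
    text_type: Optional[str] = None
    language: Optional[str] = None
    voice_settings: Optional[dict] = None  # provider-specific, passed through as-is
    pronunciation_dict_id: Optional[str] = None
    disable_cache: Optional[bool] = None

    def to_json(self) -> str:
        """Serialize the request, omitting fields that were not set."""
        if self.output_type is not None and self.output_type not in OUTPUT_TYPES:
            raise ValueError(f"unknown output_type: {self.output_type!r}")
        if self.text_type is not None and self.text_type not in TEXT_TYPES:
            raise ValueError(f"unknown text_type: {self.text_type!r}")
        body = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(body)


request = SpeechRequest(
    text="Hello from the cache-bypass example.",
    voice="Telnyx.Ultra.aura",
    output_type="base64_output",
    disable_cache=True,
)
payload = request.to_json()
```

Validating the enumerated values locally catches typos before a request is sent; voice_settings is left unvalidated because its contents depend on the provider.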
