Endpoint
Example
Request Body
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | Yes | — | Text to synthesize. Markdown is automatically stripped. |
| voice | string | Yes | — | Dot-separated voice identifier. Format: Provider.Model.VoiceId (e.g., Telnyx.NaturalHD.astra), or Provider.VoiceId when the provider has a single model. |
| output_type | string | No | binary_output | Response format: binary_output, base64_output, or audio_id. |
| language | string | No | — | BCP-47 language code (e.g., en-US). Supported by AWS Polly, Azure, ElevenLabs, and Inworld. Ignored by other providers. |
| text_type | string | No | text | text or ssml. SSML is supported by AWS Polly and Azure. Ultra has its own SSML emotion syntax. |
| voice_settings | object | No | — | Provider-specific tuning (speed, pitch, format, emotion). Fields vary by provider. See individual provider pages. |
| pronunciation_dict_id | string | No | — | UUID of a custom pronunciation dictionary. Word replacements are applied before synthesis. |
| disable_cache | boolean | No | false | Bypass the audio cache and always synthesize fresh. |
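
The fields above can be assembled into a JSON request body as in the following minimal sketch. The helper name `build_tts_request` and its client-side checks are hypothetical and not part of the API. The endpoint URL and authentication are deliberately left out because they are not specified in this section. The checks only mirror the table (required fields, allowed values, dot-separated voice format); the server remains the authority on what is valid.

```python
import json

# Allowed values taken from the request body table.
ALLOWED_OUTPUT_TYPES = {"binary_output", "base64_output", "audio_id"}
ALLOWED_TEXT_TYPES = {"text", "ssml"}


def build_tts_request(
    text,
    voice,
    *,
    output_type="binary_output",
    language=None,
    text_type="text",
    voice_settings=None,
    pronunciation_dict_id=None,
    disable_cache=False,
):
    """Hypothetical helper: build a request body dict from the documented fields."""
    if not text:
        raise ValueError("text is required")

    # voice is Provider.Model.VoiceId, or Provider.VoiceId for single-model providers.
    parts = voice.split(".") if voice else []
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError("voice must be Provider.Model.VoiceId or Provider.VoiceId")

    if output_type not in ALLOWED_OUTPUT_TYPES:
        raise ValueError(f"unsupported output_type: {output_type}")
    if text_type not in ALLOWED_TEXT_TYPES:
        raise ValueError(f"unsupported text_type: {text_type}")

    body = {
        "text": text,
        "voice": voice,
        "output_type": output_type,
        "text_type": text_type,
        "disable_cache": disable_cache,
    }

    # Optional fields are sent only when the caller sets them.
    if language is not None:
        body["language"] = language
    if voice_settings is not None:
        body["voice_settings"] = voice_settings
    if pronunciation_dict_id is not None:
        body["pronunciation_dict_id"] = pronunciation_dict_id

    return body


if __name__ == "__main__":
    payload = build_tts_request(
        "Hello from the docs.",
        "Telnyx.NaturalHD.astra",
        language="en-US",
    )
    print(json.dumps(payload, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent with any HTTP client. Since `voice_settings` varies by provider, the sketch passes it through unvalidated.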