Required
Text to synthesize. Plain text or SSML (when text_type is "ssml" and supported by the provider).
Voice identifier in Provider.Model.VoiceId format. Examples: Telnyx.Ultra.aura, Telnyx.NaturalHD.astra, aws.Polly.Neural.Joanna.
Output
Response format. Default: binary_output.
binary_output: raw audio bytes with Content-Type header
base64_output: JSON with base64_audio field
audio_id: returns an ID for async retrieval via GET /v2/text-to-speech/speech/:audio_id
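The three response formats call for different client-side handling. The sketch below is one way to normalize them. It is not part of the API: the helper name extract_audio is hypothetical, and the "audio_id" key read from the JSON body in the async case is an assumed field name. Only base64_audio and the retrieval path come from the descriptions above.

```python
import base64
import json


def extract_audio(output_mode: str, body: bytes) -> dict:
    """Normalize a response body according to the requested response format.

    Hypothetical helper: returns {"audio": bytes} when the audio is inline,
    or {"retrieve_path": str} when it must be fetched later.
    """
    if output_mode == "binary_output":
        # The body is the raw audio; its format is given by the Content-Type header.
        return {"audio": body}
    if output_mode == "base64_output":
        # JSON document whose base64_audio field holds the encoded audio.
        payload = json.loads(body)
        return {"audio": base64.b64decode(payload["base64_audio"])}
    if output_mode == "audio_id":
        payload = json.loads(body)
        audio_id = payload["audio_id"]  # assumed field name, not confirmed by the docs
        return {"retrieve_path": f"/v2/text-to-speech/speech/{audio_id}"}
    raise ValueError(f"unknown response format: {output_mode!r}")
```

In the async case the caller would issue a GET against the returned path once the audio is ready. How readiness is signaled is not covered here.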
Content
Input text format: text (default) or ssml. SSML is supported by AWS Polly and Azure.
Language code (ISO 639-1 or full language name). Used by providers that support language selection.
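Because SSML is accepted only for some providers, a client may want to check the text_type value against the voice identifier before sending a request. The following is a minimal sketch under stated assumptions: the function name check_text_type is hypothetical, the provider is taken to be the first dot-separated segment of the voice identifier, and the "azure" prefix is a guess (only "aws" appears in the examples above). The service itself remains the authority on what is accepted.

```python
# Provider prefixes assumed to accept SSML. "aws" matches the example voice
# aws.Polly.Neural.Joanna; "azure" is an assumed prefix for Azure voices.
SSML_PROVIDERS = {"aws", "azure"}


def check_text_type(voice: str, text_type: str = "text") -> str:
    """Return text_type if it is plausible for the given voice, else raise."""
    if text_type not in ("text", "ssml"):
        raise ValueError(f"text_type must be 'text' or 'ssml', got {text_type!r}")
    # Treat the first dot-separated segment of the voice identifier as the provider.
    provider = voice.split(".", 1)[0].lower()
    if text_type == "ssml" and provider not in SSML_PROVIDERS:
        raise ValueError(f"provider {provider!r} is not known to accept SSML")
    return text_type
```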
Voice Settings
Provider-specific parameters. See Voice Settings.
Pronunciation
ID of a pronunciation dictionary to apply custom word replacements before synthesis.
Caching
When true, bypasses the audio cache and synthesizes fresh audio. Default: false.