Generate synthesized speech audio from text input. Returns audio in the requested format (binary audio stream, base64-encoded JSON, or an audio URL for later retrieval).
Authentication is provided via the standard Authorization: Bearer <API_KEY> header.
The voice parameter provides a convenient shorthand to specify provider, model, and voice in a single string (e.g. telnyx.NaturalHD.Alloy). Alternatively, specify provider explicitly along with provider-specific parameters.
Supported providers: aws, telnyx, azure, elevenlabs, minimax, rime, resemble.
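As an illustration of the voice shorthand, the sketch below splits a voice string on the client side into its provider, model, and voice components, using the formats provider.model_id.voice_id and provider.voice_id. The helper name split_voice and the rule that the last dot-separated segment is the voice (with everything between it and the provider treated as the model) are assumptions made for this example; the service's own parsing rules are not documented here and may differ.

```python
from typing import NamedTuple, Optional

# Provider names copied from the "Supported providers" list.
SUPPORTED_PROVIDERS = {
    "aws", "telnyx", "azure", "elevenlabs", "minimax", "rime", "resemble",
}


class VoiceParts(NamedTuple):
    provider: str
    model_id: Optional[str]
    voice_id: str


def split_voice(voice: str) -> VoiceParts:
    """Hypothetical client-side split of a voice shorthand string."""
    # The first segment is the provider.
    provider, sep, rest = voice.partition(".")
    if not sep or not rest:
        raise ValueError(f"expected provider.voice_id, got {voice!r}")
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")

    # Assumption: the last segment is the voice; anything between the
    # provider and the voice is treated as the model identifier.
    model_id, _, voice_id = rest.rpartition(".")
    if not voice_id:
        raise ValueError(f"missing voice_id in {voice!r}")
    return VoiceParts(provider, model_id or None, voice_id)
```

Under this rule, a string with no model segment, such as azure.en-US-AvaMultilingualNeural, yields a model_id of None.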
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Request body for generating speech from text.
Voice identifier in the format provider.model_id.voice_id or provider.voice_id. Examples: telnyx.NaturalHD.Alloy, azure.en-US-AvaMultilingualNeural, aws.Polly.Generative.Lucia. When provided, provider, model_id, and voice_id are extracted automatically and take precedence over individual parameters.
The text to convert to speech.
TTS provider. Required unless voice is provided.
Allowed values: aws, telnyx, azure, elevenlabs, minimax, rime, resemble.
Language code (e.g. en-US). Usage varies by provider.
Text type. Use ssml for SSML-formatted input (supported by AWS and Azure).
Allowed values: text, ssml.
Determines the response format. binary_output returns raw audio bytes, base64_output returns base64-encoded audio in JSON.
Allowed values: binary_output, base64_output.
When true, bypass the audio cache and generate fresh audio.
Provider-specific voice settings. Contents vary by provider — see provider-specific parameter objects below.
AWS Polly provider-specific parameters.
Telnyx provider-specific parameters.
Azure Cognitive Services provider-specific parameters.
ElevenLabs provider-specific parameters.
Minimax provider-specific parameters.
Rime provider-specific parameters.
Resemble AI provider-specific parameters.
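The request described above can be assembled roughly as follows. This is a minimal sketch, not the official client: the function name build_tts_request is hypothetical, the body field name text and the application/json content type are assumptions, and the endpoint URL is not stated in this section, so the example only builds the headers and body without sending anything. The voice, provider, and output_type names and the Bearer header format come from the descriptions above.

```python
import json
from typing import Dict, Optional, Tuple

OUTPUT_TYPES = ("binary_output", "base64_output")


def build_tts_request(
    api_key: str,
    text: str,
    voice: Optional[str] = None,
    provider: Optional[str] = None,
    output_type: str = "binary_output",
) -> Tuple[Dict[str, str], bytes]:
    """Build headers and a JSON body for a speech request (illustrative)."""
    # provider is required unless voice is given.
    if voice is None and provider is None:
        raise ValueError("either voice or provider must be provided")
    if output_type not in OUTPUT_TYPES:
        raise ValueError(f"output_type must be one of {OUTPUT_TYPES}")

    headers = {
        "Authorization": f"Bearer {api_key}",
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }
    # Assumption: the text parameter is named "text".
    body = {"text": text, "output_type": output_type}
    if voice is not None:
        body["voice"] = voice
    if provider is not None:
        body["provider"] = provider
    return headers, json.dumps(body).encode("utf-8")
```

The returned headers and bytes can be passed to any HTTP client as a POST request, with the endpoint URL taken from the API reference.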
Speech generated successfully. The response format depends on the output_type parameter:
binary_output (default): Returns raw audio bytes with the appropriate Content-Type header (e.g. audio/mpeg).
base64_output: Returns a JSON object with a base64_audio field.
Raw audio bytes. Returned when output_type is binary_output (default).
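A client can handle both response formats with one helper, sketched below. The function name extract_audio is hypothetical, and the example assumes that the base64_output response is served with an application/json Content-Type; the base64_audio field name comes from the description above.

```python
import base64
import json


def extract_audio(content_type: str, body: bytes) -> bytes:
    """Return raw audio bytes from either response format (illustrative)."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()

    # Assumption: base64_output responses are labelled application/json.
    if media_type == "application/json":
        payload = json.loads(body.decode("utf-8"))
        return base64.b64decode(payload["base64_audio"])

    # binary_output: the body already holds the raw audio (e.g. audio/mpeg).
    return body
```

The resulting bytes can be written to a file whose extension matches the audio format that was returned.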