Output Type

output_type controls how audio is returned in the HTTP response:
| Value | Response |
| --- | --- |
| binary_output (default) | Raw audio bytes. The Content-Type header is set to the audio MIME type (e.g., audio/mpeg). |
| base64_output | JSON body: {"base64_audio": "<base64>"} |
| audio_id | JSON body with an audio_id for later retrieval via GET /v2/text-to-speech/speech/:audio_id |
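
As a rough illustration of how a client might branch on output_type, the sketch below decodes a response body for each of the three values. The helper names `extract_audio` and `retrieval_path` are hypothetical, and the JSON key `audio_id` in the third case is an assumption based on the description above; the source only states that the body contains an audio_id.

```python
import base64
import json


def extract_audio(output_type, body):
    """Return {"audio": bytes or None, "audio_id": str or None} for a response body.

    `body` is the raw HTTP response body as bytes.
    """
    if output_type == "binary_output":
        # The body already is the raw audio.
        return {"audio": body, "audio_id": None}
    if output_type == "base64_output":
        payload = json.loads(body)
        return {"audio": base64.b64decode(payload["base64_audio"]), "audio_id": None}
    if output_type == "audio_id":
        payload = json.loads(body)
        # Assumption: the identifier is stored under the key "audio_id".
        return {"audio": None, "audio_id": payload["audio_id"]}
    raise ValueError(f"unsupported output_type: {output_type!r}")


def retrieval_path(audio_id):
    """Build the path for fetching stored audio later (GET request)."""
    return f"/v2/text-to-speech/speech/{audio_id}"
```

Keeping the decoding separate from the HTTP call makes it possible to use the same helper with any HTTP client.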

Common Format Vocabulary

| Format | Description |
| --- | --- |
| mp3 | MPEG Layer 3 |
| wav | WAV container (PCM) |
| linear16 | Raw 16-bit PCM |
| mulaw | μ-law encoded |
| alaw | A-law encoded |
| ogg_vorbis | OGG Vorbis |
| pcm | Alias for linear16 (kept for backward compatibility) |
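
The alias rule above can be sketched as a small client-side normalizer. This is only an illustration: `normalize_format` is a hypothetical helper, and exact, case-sensitive matching is an assumption, since the source does not say how the API treats other spellings.

```python
# Canonical names from the format vocabulary above.
CANONICAL_FORMATS = {"mp3", "wav", "linear16", "mulaw", "alaw", "ogg_vorbis"}

# Backward-compatible aliases mapped to their canonical name.
FORMAT_ALIASES = {"pcm": "linear16"}


def normalize_format(name):
    """Map a format name to its canonical form, resolving known aliases."""
    canonical = FORMAT_ALIASES.get(name, name)
    if canonical not in CANONICAL_FORMATS:
        raise ValueError(f"unknown audio format: {name!r}")
    return canonical
```

For example, `normalize_format("pcm")` returns `"linear16"`, while canonical names pass through unchanged.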

Provider Format Support Matrix

| Provider | Supported formats | sample_rate |
| --- | --- | --- |
| telnyx | mp3, linear16 | pass-through |
| aws | mp3, linear16, ogg_vorbis | pass-through |
| azure | mp3, wav, linear16, mulaw, alaw | 8000 / 16000 / 24000 / 48000 |
| elevenlabs | mp3, linear16, mulaw | codec-specific |
| rime | mp3, linear16 | pass-through |
| minimax | mp3, linear16 | 8000 / 16000 / 22050 / 24000 / 32000 / 44100 |
| resemble | mp3, wav | pass-through |
| inworld | mp3, linear16 | pass-through |
| qwen | mp3, linear16 | N/A (fixed 24 kHz) |
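
One possible way to use the matrix is a pre-flight check before sending a request. The sketch below transcribes the table into a dictionary; `check_request` is a hypothetical helper, not part of the API. It assumes that sample rates can only be checked client-side for providers that list discrete values (azure and minimax), so pass-through, codec-specific, and fixed-rate providers are left to the server.

```python
# provider -> (supported formats, allowed sample rates or None).
# None means the matrix lists no discrete rates to check against
# (pass-through, codec-specific, or fixed rate).
SUPPORT = {
    "telnyx": ({"mp3", "linear16"}, None),
    "aws": ({"mp3", "linear16", "ogg_vorbis"}, None),
    "azure": ({"mp3", "wav", "linear16", "mulaw", "alaw"},
              {8000, 16000, 24000, 48000}),
    "elevenlabs": ({"mp3", "linear16", "mulaw"}, None),
    "rime": ({"mp3", "linear16"}, None),
    "minimax": ({"mp3", "linear16"},
                {8000, 16000, 22050, 24000, 32000, 44100}),
    "resemble": ({"mp3", "wav"}, None),
    "inworld": ({"mp3", "linear16"}, None),
    "qwen": ({"mp3", "linear16"}, None),
}


def check_request(provider, audio_format, sample_rate=None):
    """Return a list of problems found; an empty list means none were detected."""
    if provider not in SUPPORT:
        return [f"unknown provider: {provider}"]
    formats, rates = SUPPORT[provider]
    problems = []
    if audio_format not in formats:
        problems.append(f"{provider} does not support format {audio_format}")
    # Only check the rate when the matrix lists discrete values.
    if rates is not None and sample_rate is not None and sample_rate not in rates:
        problems.append(f"{provider} does not support sample_rate {sample_rate}")
    return problems
```

Returning a list rather than raising on the first mismatch lets a caller report a format problem and a sample-rate problem together.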