Default Behavior
WebSocket audio is delivered base64-encoded inside JSON frames. The default format depends on the provider and model. Use the audio_format query parameter to override it.
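
As a minimal sketch of the override, the Python snippet below sets audio_format on a WebSocket URL and decodes the audio payload from one JSON frame. The endpoint URL, the voice parameter, and the "audio" field name are placeholders chosen for illustration, not documented values; substitute the ones your integration uses.

```python
import base64
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_audio_format(ws_url: str, audio_format: str) -> str:
    """Return ws_url with the audio_format query parameter set (or replaced)."""
    parts = urlsplit(ws_url)
    query = dict(parse_qsl(parts.query))
    query["audio_format"] = audio_format
    return urlunsplit(parts._replace(query=urlencode(query)))


def decode_audio_frame(frame_text: str, field: str = "audio") -> bytes:
    """Decode base64 audio from a JSON text frame.

    The field name "audio" is an assumption; check the actual frame schema.
    """
    payload = json.loads(frame_text)
    return base64.b64decode(payload[field])


# Hypothetical endpoint, used only to show the URL shape.
url = with_audio_format("wss://example.invalid/tts?voice=demo", "linear16")

# Simulated frame standing in for a message received over the socket.
frame = json.dumps({"audio": base64.b64encode(b"\x00\x01\x02\x03").decode("ascii")})
pcm_bytes = decode_audio_frame(frame)
```

Keeping URL construction separate from frame decoding means the same helpers work with any WebSocket client library.
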
Format Support by Provider
| Provider | Supported Formats |
|---|---|
| Telnyx | mp3, linear16 |
| AWS Polly | mp3, linear16, ogg_vorbis |
| Azure | mp3, wav, linear16, mulaw, alaw |
| ElevenLabs | mp3, linear16, mulaw |
| Rime | mp3, linear16 |
| Minimax | mp3, linear16 |
| Resemble | mp3, wav |
| Inworld | mp3, linear16 |
| Qwen | mp3, linear16 |
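
The table above can be mirrored as a small client-side lookup so that an unsupported audio_format is caught before a connection is opened. This is an illustrative sketch: the helper name supports_format is hypothetical, and the data is copied from the table rather than fetched from the service.

```python
# Supported formats per provider, copied from the table above.
SUPPORTED_FORMATS = {
    "Telnyx": {"mp3", "linear16"},
    "AWS Polly": {"mp3", "linear16", "ogg_vorbis"},
    "Azure": {"mp3", "wav", "linear16", "mulaw", "alaw"},
    "ElevenLabs": {"mp3", "linear16", "mulaw"},
    "Rime": {"mp3", "linear16"},
    "Minimax": {"mp3", "linear16"},
    "Resemble": {"mp3", "wav"},
    "Inworld": {"mp3", "linear16"},
    "Qwen": {"mp3", "linear16"},
}


def supports_format(provider: str, audio_format: str) -> bool:
    """Return True if the table lists audio_format for provider.

    Unknown providers return False rather than raising.
    """
    return audio_format in SUPPORTED_FORMATS.get(provider, set())


print(supports_format("Azure", "mulaw"))   # True
print(supports_format("Resemble", "linear16"))  # False
```
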
Accepted Sample Rates
| Provider | Accepted Sample Rates (Hz) |
|---|---|
| Telnyx/Rime | 8000, 16000, 22050, 24000, 44100, 48000, 96000 |
| Telnyx/Cartesia | 8000, 16000, 22050, 24000, 44100 |
| Telnyx/Qwen | 24000 |
| Telnyx/Kokoro | 24000 |
| Telnyx/LibriTTS | 24000 |
| Rime (direct) | 8000, 16000, 22050, 24000, 44100, 48000, 96000 |
| Qwen (direct) | 24000 |
| Azure | 8000, 16000, 24000, 48000 |
| AWS | 8000, 16000, 22050, 24000 |
| Minimax | 8000, 16000, 22050, 24000, 32000, 44100 |
| Resemble | 8000, 16000, 22050, 32000, 44100, 48000 |
| Inworld | 8000, 16000, 22050, 24000, 44100, 48000 |
| ElevenLabs | 8000, 16000, 22050, 24000, 44100 |
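
The sample-rate table can be checked the same way. The sketch below, with hypothetical helper names, validates a rate for one provider and lists every provider entry that accepts a given rate. The values are copied from the table above.

```python
# Accepted sample rates in Hz, copied from the table above.
ACCEPTED_SAMPLE_RATES = {
    "Telnyx/Rime": (8000, 16000, 22050, 24000, 44100, 48000, 96000),
    "Telnyx/Cartesia": (8000, 16000, 22050, 24000, 44100),
    "Telnyx/Qwen": (24000,),
    "Telnyx/Kokoro": (24000,),
    "Telnyx/LibriTTS": (24000,),
    "Rime (direct)": (8000, 16000, 22050, 24000, 44100, 48000, 96000),
    "Qwen (direct)": (24000,),
    "Azure": (8000, 16000, 24000, 48000),
    "AWS": (8000, 16000, 22050, 24000),
    "Minimax": (8000, 16000, 22050, 24000, 32000, 44100),
    "Resemble": (8000, 16000, 22050, 32000, 44100, 48000),
    "Inworld": (8000, 16000, 22050, 24000, 44100, 48000),
    "ElevenLabs": (8000, 16000, 22050, 24000, 44100),
}


def is_accepted_rate(provider: str, sample_rate: int) -> bool:
    """Return True if the table lists sample_rate (Hz) for provider."""
    return sample_rate in ACCEPTED_SAMPLE_RATES.get(provider, ())


def providers_accepting(sample_rate: int) -> list[str]:
    """Return provider entries that accept sample_rate, in table order."""
    return [
        provider
        for provider, rates in ACCEPTED_SAMPLE_RATES.items()
        if sample_rate in rates
    ]


print(is_accepted_rate("Azure", 22050))  # False
print(providers_accepting(48000))
```

Note that 24000 Hz is the only rate listed for the Qwen, Kokoro, and LibriTTS entries, and Resemble is the only entry that does not list it.
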