input_format query parameter. Audio is sent as binary WebSocket frames — chunked bytes, no base64, no JSON wrapping.
Container formats (mp3, webm, etc.) are self-describing: the server demuxes the byte stream and extracts encoding/sample rate from headers. Raw formats have no metadata, so you must set sample_rate explicitly.
Works for both real-time capture (microphone, MediaRecorder, telephony bridge) and file streaming (read a file in chunks, push through the socket).
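
The file-streaming case can be sketched roughly as follows. This is only an illustration under stated assumptions: the endpoint `wss://stt.example.invalid/stream`, the helper names, and the 4096-byte chunk size are placeholders, and passing `sample_rate` as a query parameter alongside `input_format` is assumed rather than confirmed by the text above.

```python
import io
from typing import BinaryIO, Iterator, Optional
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the real WebSocket URL.
BASE_URL = "wss://stt.example.invalid/stream"


def build_stream_url(input_format: str, sample_rate: Optional[int] = None) -> str:
    """Attach input_format (and sample_rate, for raw formats) as query parameters.

    Assumption: sample_rate travels as a query parameter like input_format.
    """
    params = {"input_format": input_format}
    if sample_rate is not None:
        params["sample_rate"] = str(sample_rate)
    return f"{BASE_URL}?{urlencode(params)}"


def iter_chunks(stream: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield raw byte chunks, each meant for one binary frame (no base64, no JSON)."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Example: a 10,000-byte stand-in for an audio file.
fake_file = io.BytesIO(b"\x00" * 10_000)
chunk_sizes = [len(c) for c in iter_chunks(fake_file)]
url = build_stream_url("linear16", sample_rate=16000)
```

With a WebSocket client, each yielded chunk would be sent as one binary frame; in the third-party `websockets` package, for instance, passing a `bytes` object to `send()` produces a binary frame.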
Browser Capture
Output from MediaRecorder or similar browser APIs. Container headers carry sample rate.
| Format | Sample rate | Notes |
|---|---|---|
| webm | from header | WebM container |
| webm_opus | from header | WebM + Opus. Valid: 8000–48000. Alias: webm-opus |
| ogg_opus | from header | Ogg + Opus. Valid: 8000–48000. Alias: ogg-opus |
| ogg | from header | Ogg container (Vorbis or other) |
Telephony
Codecs from voice networks. Raw frames, sample_rate required.
| Format | Sample rate | Notes |
|---|---|---|
| mulaw | any | G.711 µ-law. North America. Default: 8000 Hz. |
| alaw | any | G.711 A-law. EU/international. Default: 8000 Hz. |
| g729 | 8000 | G.729. Fixed. |
| amr_nb | 8000 | AMR narrowband. Fixed. Alias: amr-nb |
| amr_wb | 16000 | AMR wideband. Fixed. Alias: amr-wb |
| speex | 8000, 16000, 32000 | Google: 16000 only. |
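
A client might check the telephony rates in the table before opening a connection. The sketch below is one way to do that; the function names are hypothetical, and the 20 ms frame duration is an assumption (a common telephony packetization interval), not something this API is stated to require.

```python
from typing import Dict, Optional, Set

# Rates taken from the telephony table above; None means any rate is accepted.
TELEPHONY_RATES: Dict[str, Optional[Set[int]]] = {
    "mulaw": None,
    "alaw": None,
    "g729": {8000},
    "amr_nb": {8000},
    "amr_wb": {16000},
    "speex": {8000, 16000, 32000},
}
# Hyphenated aliases listed in the table.
ALIASES = {"amr-nb": "amr_nb", "amr-wb": "amr_wb"}


def check_telephony_rate(input_format: str, sample_rate: int) -> str:
    """Return the canonical format name, or raise ValueError on a bad rate."""
    name = ALIASES.get(input_format, input_format)
    if name not in TELEPHONY_RATES:
        raise ValueError(f"not a telephony format: {input_format}")
    allowed = TELEPHONY_RATES[name]
    if allowed is not None and sample_rate not in allowed:
        raise ValueError(f"{name} does not accept {sample_rate} Hz")
    return name


def g711_frame_bytes(sample_rate: int = 8000, frame_ms: int = 20) -> int:
    """G.711 (mulaw/alaw) uses one byte per sample, so bytes equal samples."""
    return sample_rate * frame_ms // 1000
```

Under these assumptions, a 20 ms G.711 frame at 8000 Hz is 160 bytes, which is a natural chunk size when relaying audio from a telephony bridge.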
Raw PCM
Uncompressed audio from microphones, processing pipelines, or SDKs. sample_rate required.
| Format | Sample rate | Notes |
|---|---|---|
| linear16 | any | 16-bit signed PCM, little-endian (s16le). Default: 16000 Hz. |
| linear32 | any | 32-bit float PCM, little-endian (f32le). Default: 16000 Hz. |
| opus | 8000, 12000, 16000, 24000, 48000 | Raw Opus frames, no container. Deepgram also: 44100. |
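
To make the two PCM layouts concrete, the sketch below packs floating-point samples into s16le and f32le bytes using the standard `struct` module. The helper names are illustrative, and mono audio is assumed because the text above does not discuss channel count.

```python
import struct
from typing import Sequence


def floats_to_linear16(samples: Sequence[float]) -> bytes:
    """Clip to [-1.0, 1.0] and pack as 16-bit signed little-endian (s16le)."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def floats_to_linear32(samples: Sequence[float]) -> bytes:
    """Pack as 32-bit float little-endian (f32le)."""
    return struct.pack(f"<{len(samples)}f", *samples)


def pcm_bytes_per_second(sample_rate: int, bytes_per_sample: int, channels: int = 1) -> int:
    """Data rate of uncompressed PCM; channels=1 is an assumption (mono)."""
    return sample_rate * bytes_per_sample * channels


pcm16 = floats_to_linear16([0.0, 1.0, -1.0])
pcm32 = floats_to_linear32([0.0, 0.5])
```

At the 16000 Hz default, mono linear16 works out to 32000 bytes per second and linear32 to 64000, which helps when choosing a chunk size for real-time capture.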
Recorded File
Pre-recorded files read in chunks and streamed through the socket. Container headers carry sample rate.
| Format | Sample rate | Notes |
|---|---|---|
| mp3 | from header | Default for most engines |
| wav | from header | Uncompressed. Default for Flux model. |
| flac | from header | Lossless compression |
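
The idea that a container header carries the sample rate can be seen with Python's standard `wave` module. The sketch below writes a tiny WAV of silence in memory and reads the rate back from its header; it only illustrates what a self-describing format means and makes no claim about how the server parses files.

```python
import io
import wave


def make_wav(sample_rate: int, num_frames: int) -> bytes:
    """Build a small mono 16-bit WAV of silence in memory (stand-in for a real file)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * num_frames)
    return buf.getvalue()


def wav_sample_rate(data: bytes) -> int:
    """Read the sample rate back out of the WAV header."""
    with wave.open(io.BytesIO(data), "rb") as r:
        return r.getframerate()


wav_bytes = make_wav(sample_rate=44100, num_frames=100)
header_rate = wav_sample_rate(wav_bytes)
```

Because the rate is recoverable from the bytes themselves, a client streaming such a file has no need to supply sample_rate separately.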
Engine Compatibility
Unsupported format/engine combination returns error 40002. Unsupported Flux format returns error 40006. Deepgram has three model generations with different format support. Flux is the most restrictive: compared to Nova, it drops mp3, flac, webm_opus, amr_nb, amr_wb, g729, and speex.
| Format | Deepgram Nova | Deepgram Flux | Telnyx | Azure | |
|---|---|---|---|---|---|
| mp3 | ✓ | ✓ | ✓ | ✓ | |
| wav | ✓ | ✓ | ✓ | ✓ | ✓ |
| webm | ✓ | ✓ | |||
| ogg | ✓ | ✓ | |||
| flac | ✓ | ✓ | |||
| ogg_opus | ✓ | ✓ | ✓ | ||
| webm_opus | ✓ | ✓ | |||
| linear16 | ✓ | ✓ | ✓ | ✓ | ✓ |
| linear32 | ✓ | ✓ | ✓ | ||
| mulaw | ✓ | ✓ | ✓ | ||
| alaw | ✓ | ✓ | |||
| opus | ✓ | ✓ | |||
| amr_nb | ✓ | ✓ | |||
| amr_wb | ✓ | ✓ | |||
| g729 | ✓ | ||||
| speex | ✓ | ✓ |
wav, linear16.
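
A client could catch the Flux restriction before connecting and translate the two error codes into readable messages. The sketch below encodes only what the prose in this section states (the two codes and the list of formats Flux drops relative to Nova); the function names are hypothetical, and the shape of the server's actual error payload is not assumed.

```python
# Error codes quoted from the text above.
ERROR_MEANINGS = {
    40002: "unsupported format/engine combination",
    40006: "unsupported Flux format",
}

# Formats the text says Flux drops relative to Nova.
FLUX_DROPPED = frozenset(
    {"mp3", "flac", "webm_opus", "amr_nb", "amr_wb", "g729", "speex"}
)


def flux_rejects(input_format: str) -> bool:
    """True if the prose above lists this format as dropped by Flux."""
    return input_format in FLUX_DROPPED


def explain_error(code: int) -> str:
    """Map a numeric error code to the description given in this section."""
    return ERROR_MEANINGS.get(code, f"unrecognized error code {code}")
```

A pre-flight check like `flux_rejects("mp3")` avoids a round trip that would end in error 40006.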