Comparison
| Engine | Model (WebSocket) | Model (REST) | Latency | Languages | Best for |
|---|---|---|---|---|---|
| Deepgram | nova-3 | deepgram/nova-3 | Low | 40+ (reference) | Recommended. Highest English accuracy, diarization, word timestamps |
| Deepgram | nova-2 | deepgram/nova-2 | Low | 40+ | Legacy — use nova-3 unless you have a specific reason |
| Deepgram | flux | — | Lowest | English only | Voice agents — built-in end-of-turn detection (WebSocket only) |
| Telnyx | openai/whisper-large-v3-turbo | openai/whisper-large-v3-turbo | Medium | 50+ (reference) | Multilingual transcription |
| Telnyx | openai/whisper-tiny | openai/whisper-tiny | Low | 50+ | Lightweight, on-network |
| Google | latest_long | — | Medium | 125+ (reference) | Long-form multilingual audio (WebSocket only) |
| Azure | azure/fast | — | Medium | 100+ (reference) | Broad language and accent coverage (WebSocket only) |
Engine Details
Deepgram is the default WebSocket engine, with the best English accuracy and the richest feature set. For REST, you must explicitly set model="deepgram/nova-3"; the REST default is openai/whisper-large-v3-turbo.
Models:
- nova-3: Latest and most accurate. Supports diarization, word-level timestamps, smart formatting, numerals, and punctuation via model_config. Use this unless you need the lowest possible latency.
- nova-2: Previous generation. Still supported, but nova-3 is better in all benchmarks.
- flux: Purpose-built for voice agents. Lowest latency with built-in end-of-turn detection, which tells you when the speaker has finished so your agent can respond. WebSocket only.
Languages: 40+, plus a multi mode (10 languages with code-switching). Flux is English only. See Deepgram languages.
How to Choose
Need the highest accuracy for English?
→ Deepgram nova-3: best WER (word error rate) across all English variants.
Building a voice agent that needs to know when the user stopped talking?
→ Deepgram flux — lowest latency with built-in end-of-turn detection.
Need to transcribe files in 50+ languages?
→ Telnyx openai/whisper-large-v3-turbo via REST API.
Need diarization (who said what)?
→ Deepgram nova-3 with model_config.diarize: true.
Need broad accent/dialect support?
→ Azure azure/fast — strong coverage across regional accents.
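The decision guide above can be sketched as a small lookup helper. This is only an illustration: the `Choice` type, the `choose_engine` function, its flag names, and the order in which the flags are checked are all hypothetical, and the `model_config` value simply mirrors the `model_config.diarize: true` setting mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Choice:
    engine: str
    model: str      # model name as listed in the comparison table
    transport: str  # "websocket" or "rest"
    model_config: dict = field(default_factory=dict)


def choose_engine(*, voice_agent: bool = False, diarization: bool = False,
                  multilingual_files: bool = False,
                  broad_accents: bool = False) -> Choice:
    """Map a requirement to an engine/model pair from the guide.

    The priority order of the checks is an assumption made for this sketch.
    """
    if voice_agent:
        # flux is WebSocket only and English only.
        return Choice("Deepgram", "flux", "websocket")
    if diarization:
        # Over REST the same model would be named "deepgram/nova-3".
        return Choice("Deepgram", "nova-3", "websocket", {"diarize": True})
    if multilingual_files:
        return Choice("Telnyx", "openai/whisper-large-v3-turbo", "rest")
    if broad_accents:
        # azure/fast is WebSocket only.
        return Choice("Azure", "azure/fast", "websocket")
    # Fallback: the recommended model for English accuracy.
    return Choice("Deepgram", "nova-3", "websocket")


print(choose_engine(voice_agent=True))
print(choose_engine(diarization=True))
```

A helper like this keeps the engine choice in one place, so a change to the recommendation only touches one function.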
Specifying the Engine and Model
WebSocket: set the engine and model via query parameters.
REST: set the model via the model body parameter.
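As a rough sketch of the two styles, the snippet below builds a WebSocket URL with query parameters and a REST body carrying a `model` field. The base URL is a placeholder, and the query parameter names `transcription_engine` and `model` are assumptions for illustration, not confirmed API names. Only the `model` body parameter and the REST default come from this page.

```python
from urllib.parse import urlencode

# Placeholder endpoint, NOT a real URL. Substitute the one from the API reference.
WS_BASE_URL = "wss://example.invalid/speech-to-text"

# REST default model, as stated in the Deepgram notes on this page.
REST_DEFAULT_MODEL = "openai/whisper-large-v3-turbo"


def websocket_url(engine: str, model: str, base_url: str = WS_BASE_URL) -> str:
    """Build a WebSocket URL that selects the engine and model via query parameters.

    The parameter names used here are hypothetical.
    """
    query = urlencode({"transcription_engine": engine, "model": model})
    return f"{base_url}?{query}"


def rest_body(model: str = REST_DEFAULT_MODEL) -> dict:
    """Build the REST body fields; `model` must be set explicitly for Deepgram."""
    return {"model": model}


print(websocket_url("Deepgram", "nova-3"))
print(rest_body("deepgram/nova-3"))
```

Note that REST model names carry the provider prefix (`deepgram/nova-3`), while the WebSocket column of the comparison table lists the bare name (`nova-3`).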