Comparison

| Engine | Model (WebSocket) | Model (REST) | Latency | Languages | Best for |
| --- | --- | --- | --- | --- | --- |
| Deepgram | nova-3 | deepgram/nova-3 | Low | 40+ (reference) | Recommended. Highest English accuracy, diarization, word timestamps |
| Deepgram | nova-2 | deepgram/nova-2 | Low | 40+ | Legacy: use nova-3 unless you have a specific reason |
| Deepgram | flux | n/a | Lowest | English only | Voice agents: built-in end-of-turn detection (WebSocket only) |
| Telnyx | openai/whisper-large-v3-turbo | openai/whisper-large-v3-turbo | Medium | 50+ (reference) | Multilingual transcription |
| Telnyx | openai/whisper-tiny | openai/whisper-tiny | Low | 50+ | Lightweight, on-network |
| Google | latest_long | n/a | Medium | 125+ (reference) | Long-form multilingual audio (WebSocket only) |
| Azure | azure/fast | n/a | Medium | 100+ (reference) | Broad language and accent coverage (WebSocket only) |

Engine Details

Deepgram is the default WebSocket engine, with the best English accuracy and the richest feature set. For REST, you must explicitly set model="deepgram/nova-3", because the REST default is openai/whisper-large-v3-turbo.
Models:
  • nova-3 — Latest and most accurate. Supports diarization, word-level timestamps, smart formatting, numerals, and punctuation via model_config. Use this unless you need the lowest possible latency.
  • nova-2 — Previous generation. Still supported but nova-3 is better in all benchmarks.
  • flux — Purpose-built for voice agents. Lowest latency with built-in end-of-turn detection — tells you when the speaker has finished so your agent can respond. WebSocket only.
Languages: 40+ languages. Nova-3 supports multi mode (10 languages with code-switching). Flux is English only. See Deepgram languages.

How to Choose

  • Need the highest accuracy for English? → Deepgram nova-3: best WER (word error rate) across all English variants.
  • Building a voice agent that needs to know when the user stopped talking? → Deepgram flux: lowest latency with built-in end-of-turn detection.
  • Need to transcribe files in 50+ languages? → Telnyx openai/whisper-large-v3-turbo via REST API.
  • Need diarization (who said what)? → Deepgram nova-3 with model_config.diarize: true.
  • Need broad accent/dialect support? → Azure azure/fast: strong coverage across regional accents.
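The guidance above can be sketched as a small lookup helper. This is only an illustration: the function name, the flag names, and the priority order when several flags are set are assumptions, not part of the Telnyx API. The engine and model values come from the comparison table.

```python
from typing import NamedTuple


class Choice(NamedTuple):
    engine: str
    model: str
    transport: str  # "websocket" or "rest"


def choose_engine(*, voice_agent: bool = False, diarization: bool = False,
                  multilingual_files: bool = False,
                  broad_accents: bool = False) -> Choice:
    """Map a requirement to an engine/model pair from the guidance above.

    The order of the checks is an assumption made for this sketch; the
    source text does not rank the requirements against each other.
    """
    if voice_agent:
        # flux is WebSocket only and English only.
        return Choice("Deepgram", "flux", "websocket")
    if diarization:
        # nova-3 also exists on REST as "deepgram/nova-3"; WebSocket is
        # picked here arbitrarily.
        return Choice("Deepgram", "nova-3", "websocket")
    if multilingual_files:
        return Choice("Telnyx", "openai/whisper-large-v3-turbo", "rest")
    if broad_accents:
        return Choice("Azure", "azure/fast", "websocket")
    # Fallback: highest English accuracy.
    return Choice("Deepgram", "nova-3", "websocket")
```

For example, `choose_engine(voice_agent=True)` returns the Deepgram flux entry, and calling it with no flags falls back to Deepgram nova-3.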

Specifying the Engine and Model

WebSocket — set via query parameters:
wss://api.telnyx.com/v2/speech-to-text/transcription?transcription_engine=Deepgram&model=nova-3
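A minimal sketch of building that URL in Python, using only the standard library. The helper name `transcription_url` is hypothetical. The base URL and the two query parameter names are taken from the example above. Engine values other than `Deepgram` are not shown in the source, so treat them as assumptions.

```python
from urllib.parse import urlencode

# Base endpoint copied from the WebSocket example above.
BASE_URL = "wss://api.telnyx.com/v2/speech-to-text/transcription"


def transcription_url(engine: str, model: str) -> str:
    """Return the WebSocket URL with engine and model as query parameters."""
    # urlencode percent-encodes reserved characters, so a model id such as
    # "azure/fast" is sent as "azure%2Ffast".
    query = urlencode({"transcription_engine": engine, "model": model})
    return f"{BASE_URL}?{query}"


print(transcription_url("Deepgram", "nova-3"))
```

Building the query with `urlencode` rather than string concatenation keeps model ids that contain a slash valid inside the query string.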
REST API — set via the model body parameter:
curl -X POST https://api.telnyx.com/v2/ai/audio/transcriptions \
  -H "Authorization: Bearer YOUR_TELNYX_API_KEY" \
  -F model="deepgram/nova-3" \
  -F file=@audio.mp3
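Because several models in the comparison table are WebSocket only, it can help to check the model id before sending a REST request. The sketch below is a hypothetical client-side guard, not part of any Telnyx SDK. The model list and the REST default are copied from this page and would need updating if the table changes.

```python
from typing import Optional

# REST-capable model ids, taken from the "Model (REST)" column above.
REST_MODELS = frozenset({
    "deepgram/nova-3",
    "deepgram/nova-2",
    "openai/whisper-large-v3-turbo",
    "openai/whisper-tiny",
})

# The REST default stated in the Engine Details section.
REST_DEFAULT_MODEL = "openai/whisper-large-v3-turbo"


def resolve_rest_model(model: Optional[str] = None) -> str:
    """Return the model a REST request would use, or raise if unsupported."""
    if model is None:
        # Omitting the model field falls back to the REST default.
        return REST_DEFAULT_MODEL
    if model not in REST_MODELS:
        raise ValueError(
            f"{model!r} is not listed as a REST model; "
            "flux, latest_long and azure/fast are WebSocket only"
        )
    return model
```

Note that REST ids carry a provider prefix, so passing the WebSocket id `nova-3` raises here, while `deepgram/nova-3` is accepted.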