Connection Recovery
Treat the WebSocket session as disposable. Reconnect on network failure, server close, idle timeout, and process restart.

| Event | Action |
|---|---|
| Connection fails before open | Retry with backoff. Do not send audio until the connection is open. |
| Connection closes unexpectedly | Stop sending audio, preserve buffered audio, reconnect, then resume streaming. |
| Error message received | Log errors[].code, errors[].title, and errors[].source.parameter. Reconnect only after fixing parameter errors. |
| Graceful shutdown | Send {"type": "CloseStream"} and wait for final transcripts before closing the socket. |
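
The error-handling row can be sketched as a small helper that pulls out the three fields named above and decides whether an automatic reconnect makes sense. This is a minimal sketch: the function names are hypothetical, and it assumes the error message has already been parsed from JSON into a dict with an errors list.

```python
def summarize_errors(message: dict) -> list:
    """Extract (code, title, parameter) from each entry in errors[]."""
    rows = []
    for err in message.get("errors") or []:
        source = err.get("source") or {}
        rows.append((err.get("code"), err.get("title"), source.get("parameter")))
    return rows


def may_auto_reconnect(message: dict) -> bool:
    """Return False when any error points at a request parameter.

    A parameter error will repeat on every retry, so the configuration
    has to be fixed before reconnecting.
    """
    return all(parameter is None for _, _, parameter in summarize_errors(message))
```

Logging the tuples from summarize_errors before deciding keeps the failure visible even when no reconnect is attempted.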
Backoff
Use bounded exponential backoff with jitter.

| Attempt | Base delay |
|---|---|
| 1 | 250 ms |
| 2 | 500 ms |
| 3 | 1 s |
| 4 | 2 s |
| 5+ | 5 s max |
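
The schedule above can be expressed as a lookup plus a jitter step. The sketch below assumes "full jitter" (a uniform draw between zero and the base delay), which is one common choice and not something the table prescribes. The function names are illustrative.

```python
import random
from typing import Optional

# Base delays from the table above, in seconds. Attempts past the last
# entry reuse the 5 s cap.
BASE_DELAYS = [0.25, 0.5, 1.0, 2.0, 5.0]


def base_delay(attempt: int) -> float:
    """Return the un-jittered delay for a 1-based attempt number."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return BASE_DELAYS[min(attempt, len(BASE_DELAYS)) - 1]


def jittered_delay(attempt: int, rng: Optional[random.Random] = None) -> float:
    """Full jitter (assumed strategy): uniform in [0, base delay]."""
    rng = rng or random.Random()
    return rng.uniform(0.0, base_delay(attempt))
```

Jitter spreads out reconnects so that many clients dropped by the same outage do not retry in lockstep. Reset the attempt counter once a connection has opened successfully.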
Partials
Enable interim_results=true when the application needs live captions or low-latency UI updates.

| Message | Handling |
|---|---|
| is_final: false | Display as temporary text. Replace it when a newer partial arrives. Do not persist it as final transcript. |
| is_final: true | Commit to the transcript. Do not replace it with later partials. |
| utterance_end: true | Treat as a segment boundary. Do not render an empty transcript as text. |
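
The handling rules above map onto a small state holder: one list of committed segments and one replaceable partial. This is a sketch under assumptions: the class name is hypothetical, and the transcript field name is assumed for illustration, since only is_final and utterance_end are named above.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptView:
    """Committed segments plus the current, replaceable partial."""

    final_segments: list = field(default_factory=list)
    partial: str = ""
    boundaries: int = 0

    def handle(self, message: dict) -> None:
        # "transcript" is an assumed field name for the recognized text.
        text = (message.get("transcript") or "").strip()
        if message.get("is_final"):
            # Commit non-empty finals and clear the temporary text.
            if text:
                self.final_segments.append(text)
            self.partial = ""
        elif text:
            # A newer partial replaces the previous one; it is never persisted.
            self.partial = text
        if message.get("utterance_end"):
            # Segment boundary only; an empty transcript adds no text.
            self.boundaries += 1

    def render(self) -> str:
        """Text to display: committed segments followed by the live partial."""
        parts = list(self.final_segments)
        if self.partial:
            parts.append(self.partial)
        return " ".join(parts)
```

Persist only final_segments. The value returned by render is for display and may change with the next partial.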
Audio Buffering
Buffer audio at the producer boundary, not inside the WebSocket send loop.

| Control | Recommendation |
|---|---|
| Chunk size | Send 2048-8192 byte binary frames. |
| Queue size | Set a maximum buffered duration, such as 5-10 seconds. |
| Backpressure | Pause or drop low-priority audio when the queue is full. |
| Reconnect | Keep a short rolling buffer only if retranscription after reconnect is required. |
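
A bounded queue at the producer side might look like the sketch below. It assumes raw PCM with a known byte rate (for example 32000 bytes per second for 16 kHz, 16-bit mono) and a drop-oldest policy when the cap is exceeded. Both are illustrative choices, and the class and function names are hypothetical.

```python
from collections import deque


def split_frames(data: bytes, size: int = 4096):
    """Yield binary frames of at most `size` bytes (within 2048-8192)."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


class AudioQueue:
    """FIFO of audio chunks capped by buffered duration."""

    def __init__(self, bytes_per_second: int, max_seconds: float = 5.0):
        self.max_bytes = int(bytes_per_second * max_seconds)
        self.buffered_bytes = 0
        self.dropped_bytes = 0  # Export this as a backpressure metric.
        self._chunks = deque()

    def put(self, chunk: bytes) -> None:
        self._chunks.append(chunk)
        self.buffered_bytes += len(chunk)
        # Assumed policy: drop the oldest audio first when over the cap.
        while self.buffered_bytes > self.max_bytes and self._chunks:
            old = self._chunks.popleft()
            self.buffered_bytes -= len(old)
            self.dropped_bytes += len(old)

    def get(self):
        """Return the next chunk, or None when the queue is empty."""
        if not self._chunks:
            return None
        chunk = self._chunks.popleft()
        self.buffered_bytes -= len(chunk)
        return chunk
```

The send loop only calls get, so a slow or disconnected socket never blocks the audio producer. The queue absorbs the gap up to the configured duration.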
Keepalive
For Deepgram sessions, send {"type": "KeepAlive"} during long periods of silence. Keep sending audio as binary frames when audio is available.
For other engines, use the WebSocket client’s ping/pong support when available and reconnect on missed heartbeats.
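
One way to decide when a KeepAlive is due is a timer that resets whenever anything is sent. In this sketch the 5-second interval is an assumption, not a documented value, and the class name is hypothetical. The clock is injectable so the logic can be tested without sleeping.

```python
import json
import time

# Control message from the text above, sent as a text frame so it is
# not mistaken for binary audio.
KEEPALIVE_MESSAGE = json.dumps({"type": "KeepAlive"})


class KeepAliveTimer:
    """Tracks time since the last outbound frame."""

    def __init__(self, interval: float = 5.0, clock=time.monotonic):
        self.interval = interval  # Assumed value; tune to the engine's idle timeout.
        self._clock = clock
        self._last_sent = clock()

    def mark_sent(self) -> None:
        """Call after sending audio or a KeepAlive."""
        self._last_sent = self._clock()

    def due(self) -> bool:
        """True when nothing has been sent for at least `interval` seconds."""
        return self._clock() - self._last_sent >= self.interval
```

In the send loop, call mark_sent after each audio frame. When the queue is empty and due returns True, send KEEPALIVE_MESSAGE and call mark_sent again.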
Monitoring
Track connection, latency, transcript, and buffer metrics.

| Metric | Purpose |
|---|---|
| Connection attempts | Detect retry loops and regional network issues. |
| Connection duration | Detect unstable sessions and idle timeout patterns. |
| Close code and reason | Separate expected closes from failures. |
| Error codes | Identify invalid parameters and engine compatibility issues. |
| Audio queue depth | Detect send-loop backpressure. |
| Partial-to-final latency | Measure caption freshness. |
| Final transcript count | Detect stalled recognition. |
| Empty final count | Detect silence segmentation behavior. |

Log the transcription_engine, model, input_format, sample_rate, and interim_results values with each session. Redact API keys and user audio.
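
An allowlist is a simple way to log session parameters without leaking secrets: only the named fields are copied, so API keys and audio never reach the log record. This is a minimal sketch; the function names and logger name are hypothetical.

```python
import logging

logger = logging.getLogger("stt.session")  # Hypothetical logger name.

# Session parameters named in the text above. Anything not listed here
# (API keys, tokens, raw audio) is left out by construction.
SESSION_FIELDS = (
    "transcription_engine",
    "model",
    "input_format",
    "sample_rate",
    "interim_results",
)


def session_log_fields(params: dict) -> dict:
    """Return only the allowlisted parameters that are present."""
    return {name: params[name] for name in SESSION_FIELDS if name in params}


def log_session_start(params: dict) -> None:
    """Emit one log line per session with the safe subset of parameters."""
    logger.info("session started: %s", session_log_fields(params))
```

An allowlist fails safe: a new sensitive parameter added later stays out of the logs until someone explicitly lists it.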
Shutdown
Use graceful shutdown for planned stops.

- Drain the audio queue.
- Send {"type": "CloseStream"}.
- Wait for final transcript messages.
- Close the WebSocket.
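
The four steps above can be sketched as one coroutine. The ws object here is hypothetical: it is assumed to expose async send, recv, and close methods, with recv returning None once the server has closed the stream. Real client libraries signal closure differently (often with an exception), so adapt that branch to the library in use.

```python
import asyncio
import json
from collections import deque


async def graceful_shutdown(ws, audio_queue: deque, on_message,
                            final_timeout: float = 5.0) -> None:
    """Drain audio, send CloseStream, collect trailing messages, then close."""
    try:
        # 1. Drain the audio queue as binary frames.
        while audio_queue:
            await ws.send(audio_queue.popleft())
        # 2. Tell the server that no more audio is coming.
        await ws.send(json.dumps({"type": "CloseStream"}))
        # 3. Read until the server closes (recv() returns None in this
        #    sketch) or stays silent for final_timeout seconds.
        while True:
            raw = await asyncio.wait_for(ws.recv(), timeout=final_timeout)
            if raw is None:
                break
            on_message(json.loads(raw))
    except asyncio.TimeoutError:
        pass  # Server went quiet; treat the transcript as complete.
    finally:
        # 4. Close the WebSocket even if an earlier step failed.
        await ws.close()
```

The timeout bounds how long a planned stop can hang if the server never sends its last message. The 5-second default is an assumed value.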