Use these patterns when running the standalone WebSocket STT endpoint in production.
Connection Recovery
Treat the WebSocket session as disposable. Reconnect on network failure, server close, idle timeout, and process restart.
| Event | Action |
|---|---|
| Connection fails before open | Retry with backoff. Do not send audio until the connection is open. |
| Connection closes unexpectedly | Stop sending audio, preserve buffered audio, reconnect, then resume streaming. |
| Error message received | Log errors[].code, errors[].title, and errors[].source.parameter. Reconnect only after fixing parameter errors. |
| Graceful shutdown | Send {"type": "CloseStream"} and wait for final transcripts before closing the socket. |
Set all query parameters on every reconnect. STT configuration cannot be changed mid-session.
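Because configuration cannot change mid-session, one way to keep reconnects consistent is to build the full URL from a single config object each time. The following is a minimal sketch: `STT_URL` is a placeholder (not the real endpoint), `build_session_url` is a hypothetical helper, and the example values for engine, model, and format are illustrative only. The parameter names are the ones this page lists for session logging.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption); substitute the real STT WebSocket URL.
STT_URL = "wss://stt.example.invalid/stream"


def build_session_url(config: dict) -> str:
    """Return the session URL with every query parameter included."""
    params = {}
    for key, value in config.items():
        if isinstance(value, bool):
            # Booleans are written in lowercase, e.g. interim_results=true.
            params[key] = str(value).lower()
        else:
            params[key] = str(value)
    return f"{STT_URL}?{urlencode(params)}"


# Illustrative values only; use the engine, model, and format you run.
SESSION_CONFIG = {
    "transcription_engine": "my-engine",
    "model": "my-model",
    "input_format": "wav",
    "sample_rate": 16000,
    "interim_results": True,
}

first_url = build_session_url(SESSION_CONFIG)
reconnect_url = build_session_url(SESSION_CONFIG)  # identical on every reconnect
```

Building the URL from one stored config on each attempt avoids a reconnect that silently drops a parameter.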
Backoff
Use bounded exponential backoff with jitter.
| Attempt | Base delay |
|---|---|
| 1 | 250 ms |
| 2 | 500 ms |
| 3 | 1 s |
| 4 | 2 s |
| 5+ | 5 s max |
Add random jitter of 0-500 ms per attempt. Reset the attempt counter after a stable connection.
Do not retry immediately on authentication or validation errors. Fix the API key, query parameters, engine, model, or format first.
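The delay schedule above can be sketched as a small function. This is an illustration under stated assumptions: `reconnect_delay` and the constant names are hypothetical, and the optional `rng` argument exists only to make the jitter testable.

```python
import random
from typing import Optional

BASE_DELAYS_S = [0.25, 0.5, 1.0, 2.0]  # attempts 1-4 from the table
MAX_BASE_DELAY_S = 5.0                 # attempt 5 and later
MAX_JITTER_S = 0.5                     # 0-500 ms of random jitter


def reconnect_delay(attempt: int, rng: Optional[random.Random] = None) -> float:
    """Return the wait in seconds before reconnect attempt `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be 1 or greater")
    if attempt <= len(BASE_DELAYS_S):
        base = BASE_DELAYS_S[attempt - 1]
    else:
        base = MAX_BASE_DELAY_S
    source = rng if rng is not None else random
    return base + source.uniform(0.0, MAX_JITTER_S)
```

A caller would increment `attempt` after each failed connection, sleep for `reconnect_delay(attempt)`, and set `attempt` back to zero once the connection is considered stable.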
Partials
Enable interim_results=true when the application needs live captions or low-latency UI updates.
| Message | Handling |
|---|---|
| is_final: false | Display as temporary text. Replace it when a newer partial arrives. Do not persist it as final transcript. |
| is_final: true | Commit to the transcript. Do not replace it with later partials. |
| utterance_end: true | Treat as a segment boundary. Do not render an empty transcript as text. |
Store final transcript segments separately from the current partial. This prevents duplicate text when a final result arrives after one or more interim results.
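A minimal sketch of that separation follows. It assumes each message is a parsed JSON object whose text lives in a `transcript` field; that field name, the `TranscriptState` class, and its methods are assumptions for illustration, while `is_final` and `utterance_end` come from the table above.

```python
class TranscriptState:
    """Keeps committed final segments apart from the current partial."""

    def __init__(self) -> None:
        self.final_segments = []   # committed text, never overwritten
        self.current_partial = ""  # temporary text, replaced by newer partials
        self.boundaries = 0        # count of utterance_end markers seen

    def handle(self, message: dict) -> None:
        text = message.get("transcript", "")  # assumed field name
        if message.get("utterance_end"):
            self.boundaries += 1
            if not text:
                return  # boundary marker only; nothing to render
        if message.get("is_final"):
            if text:
                self.final_segments.append(text)
            self.current_partial = ""  # the final supersedes the partial
        else:
            self.current_partial = text

    def display_text(self) -> str:
        parts = list(self.final_segments)
        if self.current_partial:
            parts.append(self.current_partial)
        return " ".join(parts)


state = TranscriptState()
state.handle({"is_final": False, "transcript": "hello wor"})
state.handle({"is_final": False, "transcript": "hello world"})
state.handle({"is_final": True, "transcript": "hello world"})
state.handle({"utterance_end": True, "transcript": ""})
```

After these four messages the display text contains "hello world" once, because the final replaced the partial instead of being appended next to it.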
Audio Buffering
Buffer audio at the producer boundary, not inside the WebSocket send loop.
| Control | Recommendation |
|---|---|
| Chunk size | Send 2048-8192 byte binary frames. |
| Queue size | Set a maximum buffered duration, such as 5-10 seconds. |
| Backpressure | Pause or drop low-priority audio when the queue is full. |
| Reconnect | Keep a short rolling buffer only if retranscription after reconnect is required. |
Avoid unbounded queues. A slow or disconnected socket should not grow memory usage indefinitely.
For live audio, prefer dropping stale buffered audio over sending it late. Late audio increases transcript delay and can make captions appear out of sync.
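One way to bound the queue by duration and drop stale audio first is sketched below. `BoundedAudioBuffer` is a hypothetical class, and the byte rate in the example assumes 16 kHz, 16-bit, mono PCM (32,000 bytes per second); adjust it for the actual input format.

```python
from collections import deque
from typing import Optional


class BoundedAudioBuffer:
    """FIFO of audio chunks capped at a maximum buffered duration."""

    def __init__(self, max_seconds: float, bytes_per_second: int) -> None:
        self.max_bytes = int(max_seconds * bytes_per_second)
        self.chunks = deque()
        self.buffered_bytes = 0
        self.dropped_bytes = 0  # expose this as a metric

    def push(self, chunk: bytes) -> None:
        self.chunks.append(chunk)
        self.buffered_bytes += len(chunk)
        # Over the cap: discard the oldest (stalest) audio first.
        while self.buffered_bytes > self.max_bytes and self.chunks:
            stale = self.chunks.popleft()
            self.buffered_bytes -= len(stale)
            self.dropped_bytes += len(stale)

    def pop(self) -> Optional[bytes]:
        """Return the next chunk to send, or None when the buffer is empty."""
        if not self.chunks:
            return None
        chunk = self.chunks.popleft()
        self.buffered_bytes -= len(chunk)
        return chunk


# Assumed format: 16 kHz, 16-bit mono PCM, with a 5-second cap.
buffer = BoundedAudioBuffer(max_seconds=5, bytes_per_second=32000)
buffer.push(b"\x00" * 4096)
```

The producer calls `push` and the send loop calls `pop`, so a slow or disconnected socket costs dropped audio rather than unbounded memory.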
Keepalive
For Deepgram sessions, send {"type": "KeepAlive"} during long silence periods. Keep sending audio as binary frames when audio is available.
For other engines, use the WebSocket client’s ping/pong support when available and reconnect on missed heartbeats.
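A minimal sketch of the silence check for the Deepgram case follows. The 5-second interval is an assumption, not a documented value, and `KeepAliveTimer` is a hypothetical helper; the injectable `clock` exists only so the logic can be tested without waiting.

```python
import json
import time
from typing import Callable, Optional

KEEPALIVE_INTERVAL_S = 5.0  # assumed interval; tune to the engine's idle timeout


class KeepAliveTimer:
    """Decides when a KeepAlive message is due after a period with no sends."""

    def __init__(self, interval_s: float = KEEPALIVE_INTERVAL_S,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval_s = interval_s
        self.clock = clock
        self.last_sent = clock()

    def mark_sent(self) -> None:
        """Call after every audio frame or KeepAlive that is sent."""
        self.last_sent = self.clock()

    def due_message(self) -> Optional[str]:
        """Return the KeepAlive JSON if the interval has elapsed, else None."""
        if self.clock() - self.last_sent >= self.interval_s:
            self.mark_sent()
            return json.dumps({"type": "KeepAlive"})
        return None
```

The send loop would call `mark_sent()` after each binary audio frame and poll `due_message()` while the audio queue is empty, sending any returned string as a text message.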
Monitoring
Track connection, latency, transcript, and buffer metrics.
| Metric | Purpose |
|---|---|
| Connection attempts | Detect retry loops and regional network issues. |
| Connection duration | Detect unstable sessions and idle timeout patterns. |
| Close code and reason | Separate expected closes from failures. |
| Error codes | Identify invalid parameters and engine compatibility issues. |
| Audio queue depth | Detect send-loop backpressure. |
| Partial-to-final latency | Measure caption freshness. |
| Final transcript count | Detect stalled recognition. |
| Empty final count | Detect silence segmentation behavior. |
Log the selected transcription_engine, model, input_format, sample_rate, and interim_results value with each session. Redact API keys and user audio.
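An allowlist is one simple way to log the session settings without leaking credentials. In this sketch, `session_log_record` and the `session_id`, `close_code`, and `close_reason` fields are hypothetical; only the five logged parameter names come from this page.

```python
LOGGED_FIELDS = (
    "transcription_engine",
    "model",
    "input_format",
    "sample_rate",
    "interim_results",
)


def session_log_record(session_id: str, params: dict,
                       close_code=None, close_reason=None) -> dict:
    """Build a log record containing only allowlisted session settings."""
    record = {"session_id": session_id}
    for field in LOGGED_FIELDS:
        # Copy named fields only; keys, tokens, and audio are never included.
        record[field] = params.get(field)
    record["close_code"] = close_code
    record["close_reason"] = close_reason
    return record
```

Copying named fields, rather than deleting known secrets from a full parameter dump, means a newly added sensitive parameter stays out of the logs by default.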
Shutdown
Use graceful shutdown for planned stops.
- Drain the audio queue.
- Send {"type": "CloseStream"}.
- Wait for final transcript messages.
- Close the WebSocket.
Set a shutdown timeout. If final messages do not arrive before the timeout, close the socket and mark the transcript as incomplete.
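The shutdown sequence with a timeout can be sketched with `asyncio`. This is an illustration under stated assumptions: `ws` is a hypothetical object with async `send()`, `recv()`, and `close()` methods whose `recv()` returns `None` once the server has closed the stream, and the `transcript` field name is assumed. Adapt both to the WebSocket client in use.

```python
import asyncio
import json


async def graceful_shutdown(ws, audio_queue: asyncio.Queue, timeout_s: float = 5.0):
    """Drain audio, send CloseStream, collect finals, then close the socket.

    Returns (final_texts, complete). `complete` is False when the timeout
    expired first, so the caller can mark the transcript as incomplete.
    """
    # 1. Drain the audio queue.
    while not audio_queue.empty():
        await ws.send(audio_queue.get_nowait())

    # 2. Ask the server to finish the stream.
    await ws.send(json.dumps({"type": "CloseStream"}))

    finals = []

    async def collect() -> None:
        # 3. Read until the server closes (assumed: recv() returns None).
        while True:
            raw = await ws.recv()
            if raw is None:
                return
            message = json.loads(raw)
            if message.get("is_final") and message.get("transcript"):
                finals.append(message["transcript"])

    complete = True
    try:
        await asyncio.wait_for(collect(), timeout_s)
    except asyncio.TimeoutError:
        complete = False
    finally:
        # 4. Close the WebSocket whether or not the finals arrived.
        await ws.close()
    return finals, complete
```

Returning the `complete` flag alongside the collected text lets the caller persist a partial transcript while still recording that the shutdown timed out.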