Use these patterns when running the standalone WebSocket STT endpoint in production.

Connection Recovery

Treat the WebSocket session as disposable. Reconnect on network failure, server close, idle timeout, and process restart.

| Event | Action |
| --- | --- |
| Connection fails before open | Retry with backoff. Do not send audio until the connection is open. |
| Connection closes unexpectedly | Stop sending audio, preserve buffered audio, reconnect, then resume streaming. |
| Error message received | Log errors[].code, errors[].title, and errors[].source.parameter. Reconnect only after fixing parameter errors. |
| Graceful shutdown | Send {"type": "CloseStream"} and wait for final transcripts before closing the socket. |

Set all query parameters on every reconnect. STT configuration cannot be changed mid-session.
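
Because configuration is fixed for the life of a session, one way to avoid dropping a parameter on reconnect is to rebuild the full URL from a single config object every time. The sketch below is illustrative only: `BASE_URL` is a placeholder rather than the real endpoint, and the `SttConfig` class and `build_url` helper are hypothetical names. The parameter names are the ones this page lists under Monitoring.

```python
from dataclasses import asdict, dataclass
from urllib.parse import urlencode

# Placeholder only: substitute the real STT WebSocket endpoint.
BASE_URL = "wss://stt.example.invalid/stream"


@dataclass(frozen=True)
class SttConfig:
    """Hypothetical container for the per-session query parameters."""
    transcription_engine: str
    model: str
    input_format: str
    sample_rate: int
    interim_results: bool = True


def build_url(config: SttConfig, base_url: str = BASE_URL) -> str:
    """Return the connection URL with every query parameter set."""
    params = asdict(config)
    # Send the boolean as a lowercase string, as in interim_results=true.
    params["interim_results"] = "true" if config.interim_results else "false"
    return f"{base_url}?{urlencode(params)}"
```

Calling `build_url(config)` at the start of each connection attempt, instead of caching the URL, keeps the reconnect path identical to the first connect.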

Backoff

Use bounded exponential backoff with jitter.

| Attempt | Base delay |
| --- | --- |
| 1 | 250 ms |
| 2 | 500 ms |
| 3 | 1 s |
| 4 | 2 s |
| 5+ | 5 s max |

Add random jitter of 0-500 ms per attempt. Reset the attempt counter after a stable connection. Do not retry immediately on authentication or validation errors. Fix the API key, query parameters, engine, model, or format first.
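
The schedule above can be sketched as a small lookup plus jitter. This is a minimal illustration, not a prescribed implementation: the function names and the `NON_RETRYABLE` labels are hypothetical, and the delays are taken directly from the table.

```python
import random
from typing import Optional

# Base delays in seconds for attempts 1-4; attempt 5 and later use the 5 s cap.
BASE_DELAYS = (0.25, 0.5, 1.0, 2.0, 5.0)
MAX_JITTER = 0.5  # 0-500 ms of random jitter per attempt

# Hypothetical labels for failures that need a fix rather than a retry.
NON_RETRYABLE = {"authentication", "validation"}


def base_delay(attempt: int) -> float:
    """Base delay in seconds for a 1-indexed attempt number."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return BASE_DELAYS[min(attempt, len(BASE_DELAYS)) - 1]


def next_delay(attempt: int, rng: Optional[random.Random] = None) -> float:
    """Base delay plus uniform jitter in the range 0-500 ms."""
    source = rng if rng is not None else random
    return base_delay(attempt) + source.uniform(0.0, MAX_JITTER)


def should_retry(error_kind: Optional[str]) -> bool:
    """Authentication and validation failures must be fixed, not retried."""
    return error_kind not in NON_RETRYABLE
```

A reconnect loop would sleep for `next_delay(attempt)` before each try, increment `attempt` on failure, and set it back to 1 once a connection has stayed up long enough to count as stable.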

Partials

Enable interim_results=true when the application needs live captions or low-latency UI updates.

| Message | Handling |
| --- | --- |
| is_final: false | Display as temporary text. Replace it when a newer partial arrives. Do not persist it as final transcript. |
| is_final: true | Commit to the transcript. Do not replace it with later partials. |
| utterance_end: true | Treat as a segment boundary. Do not render an empty transcript as text. |

Store final transcript segments separately from the current partial. This prevents duplicate text when a final result arrives after one or more interim results.
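
One way to keep finals and the current partial apart is a small accumulator like the one below. It is a sketch under assumptions: the `Transcript` class is hypothetical, and the text is assumed to arrive in a field named `transcript`, which this page does not confirm. Only `is_final` and `utterance_end` come from the table above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Transcript:
    """Holds committed segments separately from the live partial."""
    finals: List[str] = field(default_factory=list)
    partial: str = ""
    boundaries: int = 0

    def handle(self, message: dict) -> None:
        # "transcript" is an assumed field name for the recognized text.
        text = (message.get("transcript") or "").strip()
        if message.get("utterance_end"):
            self.boundaries += 1  # segment boundary, nothing to render by itself
        if not text:
            return  # never render an empty transcript as text
        if message.get("is_final"):
            self.finals.append(text)  # commit; later partials cannot replace it
            self.partial = ""
        else:
            self.partial = text  # temporary text, replaced by the next partial

    def render(self) -> str:
        parts = self.finals + ([self.partial] if self.partial else [])
        return " ".join(parts)
```

Because a final clears the partial instead of being appended after it, the text of an utterance appears once in `render()` no matter how many interim results preceded it.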

Audio Buffering

Buffer audio at the producer boundary, not inside the WebSocket send loop.

| Control | Recommendation |
| --- | --- |
| Chunk size | Send 2048-8192 byte binary frames. |
| Queue size | Set a maximum buffered duration, such as 5-10 seconds. |
| Backpressure | Pause or drop low-priority audio when the queue is full. |
| Reconnect | Keep a short rolling buffer only if retranscription after reconnect is required. |

Avoid unbounded queues. A slow or disconnected socket should not grow memory usage indefinitely. For live audio, prefer dropping stale buffered audio over sending it late. Late audio increases transcript delay and can make captions appear out of sync.
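
A duration-capped queue that drops the oldest audio first could look like the following. This is a sketch, not a required design: the `BoundedAudioQueue` class is hypothetical, and the byte rate must match the actual input format (for example, 16 kHz 16-bit mono PCM is 32,000 bytes per second).

```python
from collections import deque
from typing import Optional


class BoundedAudioQueue:
    """FIFO of audio chunks capped by buffered duration; drops oldest first."""

    def __init__(self, bytes_per_second: int, max_seconds: float = 5.0):
        self.max_bytes = int(bytes_per_second * max_seconds)
        self.chunks = deque()
        self.size = 0           # bytes currently buffered
        self.dropped_bytes = 0  # running total, useful as a metric

    def put(self, chunk: bytes) -> None:
        self.chunks.append(chunk)
        self.size += len(chunk)
        # Drop stale audio instead of letting the queue grow without bound.
        while self.size > self.max_bytes and len(self.chunks) > 1:
            old = self.chunks.popleft()
            self.size -= len(old)
            self.dropped_bytes += len(old)

    def get(self) -> Optional[bytes]:
        """Return the oldest buffered chunk, or None when empty."""
        if not self.chunks:
            return None
        chunk = self.chunks.popleft()
        self.size -= len(chunk)
        return chunk
```

The producer calls `put()` with chunks in the 2048-8192 byte range and the send loop calls `get()`. Dropping from the front favors fresh audio, which matches the preference for live captions; a batch workload that cannot lose audio would pause the producer instead.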

Keepalive

For Deepgram sessions, send {"type": "KeepAlive"} during long silence periods. Keep sending audio as binary frames when audio is available. For other engines, use the WebSocket client’s ping/pong support when available and reconnect on missed heartbeats.
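
The silence check can be reduced to a timer comparison, sketched below. The 5-second interval is an assumed value to tune against the engine's idle timeout, and the helper name is hypothetical. Only the `{"type": "KeepAlive"}` payload comes from this page.

```python
import json

# Payload from this page, serialized once for reuse.
KEEPALIVE_MESSAGE = json.dumps({"type": "KeepAlive"})

# Assumed interval in seconds; tune it to the engine's idle timeout.
KEEPALIVE_INTERVAL = 5.0


def keepalive_due(now: float, last_sent: float,
                  interval: float = KEEPALIVE_INTERVAL) -> bool:
    """True when nothing has been sent for at least `interval` seconds."""
    return now - last_sent >= interval
```

In a send loop, record `time.monotonic()` after every audio frame or keepalive, and send `KEEPALIVE_MESSAGE` whenever `keepalive_due()` returns true and no audio is waiting.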

Monitoring

Track connection, latency, transcript, and buffer metrics.

| Metric | Purpose |
| --- | --- |
| Connection attempts | Detect retry loops and regional network issues. |
| Connection duration | Detect unstable sessions and idle timeout patterns. |
| Close code and reason | Separate expected closes from failures. |
| Error codes | Identify invalid parameters and engine compatibility issues. |
| Audio queue depth | Detect send-loop backpressure. |
| Partial-to-final latency | Measure caption freshness. |
| Final transcript count | Detect stalled recognition. |
| Empty final count | Detect silence segmentation behavior. |

Log the selected transcription_engine, model, input_format, sample_rate, and interim_results value with each session. Redact API keys and user audio.
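
An allow-list is a simple way to make redaction the default: only named parameters reach the log, so an API key or audio payload cannot leak by accident. The sketch below assumes a hypothetical `session_id` and helper names; the logged fields are the ones listed above.

```python
import json
import logging

logger = logging.getLogger("stt.session")

# Only these parameters are ever written to the log.
LOGGED_PARAMS = (
    "transcription_engine",
    "model",
    "input_format",
    "sample_rate",
    "interim_results",
)


def session_log_record(session_id: str, params: dict) -> dict:
    """Build a log record from allow-listed parameters only."""
    record = {"session_id": session_id}
    for name in LOGGED_PARAMS:
        record[name] = params.get(name)
    return record


def log_session(session_id: str, params: dict) -> None:
    """Emit one structured line per session."""
    record = session_log_record(session_id, params)
    logger.info("stt_session %s", json.dumps(record, sort_keys=True))
```

Anything not in `LOGGED_PARAMS` is dropped silently, so adding a new secret to the connection parameters later does not require a matching change to the logging code.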

Shutdown

Use graceful shutdown for planned stops.
  1. Drain the audio queue.
  2. Send {"type": "CloseStream"}.
  3. Wait for final transcript messages.
  4. Close the WebSocket.
Set a shutdown timeout. If final messages do not arrive before the timeout, close the socket and mark the transcript as incomplete.
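
The four steps and the timeout can be sketched with asyncio as follows. This assumes a hypothetical `ws` object exposing async `send()`, `recv()`, and `close()` methods, and it assumes `recv()` raises once the server closes the connection after the last final transcript. Neither assumption is confirmed by this page, so adapt both to the client library in use.

```python
import asyncio
import json


async def graceful_shutdown(ws, audio_queue: asyncio.Queue, on_message,
                            timeout: float = 5.0) -> bool:
    """Return True on a clean finish, False if the transcript may be incomplete."""
    # 1. Drain the audio queue.
    while not audio_queue.empty():
        await ws.send(audio_queue.get_nowait())

    # 2. Ask the server to finish the stream.
    await ws.send(json.dumps({"type": "CloseStream"}))

    async def read_until_closed() -> None:
        # 3. Forward remaining messages until recv() raises (assumed to
        #    signal that the server closed the connection).
        try:
            while True:
                on_message(await ws.recv())
        except Exception:
            return

    try:
        await asyncio.wait_for(read_until_closed(), timeout)
        complete = True
    except asyncio.TimeoutError:
        complete = False  # caller should mark the transcript as incomplete

    # 4. Close the WebSocket either way.
    await ws.close()
    return complete
```

The boolean result lets the caller record whether the stored transcript is complete, which is easier to audit later than inferring it from close codes.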