How Conversation Relay works
Conversation Relay uses a single bidirectional WebSocket connection per session:
- Your application provides a public wss:// WebSocket URL.
- Telnyx starts Conversation Relay on the call, either from TeXML or with a Programmable Voice command.
- Telnyx opens a WebSocket connection to your application.
- Telnyx sends a setup frame that identifies the session and call.
- Telnyx sends prompt, dtmf, interrupt, and error frames as call events occur.
- Your application sends text, play, sendDigits, language, or end frames back to Telnyx.
Starting Conversation Relay using TeXML
To start Conversation Relay from a TeXML application, return a <Connect> verb with a nested <ConversationRelay> verb from your TeXML voice URL.
The action attribute is configured on <Connect>, not on <ConversationRelay>. It controls where Telnyx sends the action callback after the connected service stops. Your application can use that callback request to return the next TeXML instructions for the call.
The following attributes configure the relay session:
- url - The WebSocket URL Telnyx connects to. This must start with ws:// or wss://.
- welcomeGreeting - Text Telnyx speaks when the relay starts.
- voice - The TTS voice used for generated speech.
- language - The default language for TTS and transcription.
- transcriptionProvider - The speech recognition provider.
- dtmfDetection - Enables DTMF detection and dtmf WebSocket frames.
- interruptible and welcomeGreetingInterruptible - Control when caller input can interrupt speech.
- Language - Adds a supported language with optional per-language voice, ttsProvider, transcriptionProvider, and speechModel settings.
- Parameter - Sends custom key-value data to your WebSocket server in the setup frame.
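As an illustration of the verb nesting described above, the following minimal sketch builds such a TeXML document with Python's standard library. The <Response> root element, the Parameter attribute names (name, value), the example URLs, and the attribute values are assumptions made for this sketch, not values taken from this guide.

```python
import xml.etree.ElementTree as ET


def build_relay_texml(ws_url: str, action_url: str) -> str:
    """Return a TeXML document that connects the call to Conversation Relay."""
    if not ws_url.startswith(("ws://", "wss://")):
        raise ValueError("url must start with ws:// or wss://")

    response = ET.Element("Response")  # assumed TeXML root element
    # action is set on <Connect>, not on <ConversationRelay>.
    connect = ET.SubElement(response, "Connect", {"action": action_url})
    relay = ET.SubElement(
        connect,
        "ConversationRelay",
        {
            "url": ws_url,
            "welcomeGreeting": "Hello, how can I help you today?",
            "language": "en-US",      # illustrative value
            "dtmfDetection": "true",  # illustrative value
        },
    )
    # Hypothetical attribute names for the nested Parameter element.
    ET.SubElement(relay, "Parameter", {"name": "customer_id", "value": "12345"})
    return ET.tostring(response, encoding="unicode")


print(build_relay_texml("wss://example.com/relay", "https://example.com/after-relay"))
```

Building the document with an XML library rather than string concatenation keeps attribute values escaped correctly.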
Starting Conversation Relay using Programmable Voice
You can also start Conversation Relay on an active Programmable Voice call with the Start Conversation Relay command. Use this option when your application is already controlling the call through Call Control. After the call is active, send a conversation_relay_start command with the call’s call_control_id.
- url - The Conversation Relay WebSocket URL.
- greeting - Text Telnyx speaks when the relay starts.
- voice - The TTS voice used for generated speech.
- tts_provider - The text-to-speech provider. If omitted, Telnyx derives it from voice or provider.
- voice_settings - Provider-specific voice settings.
- language - The default language for TTS and transcription.
- languages - Per-language TTS and transcription settings.
- dtmf_detection - Enables DTMF detection.
- interruptible and interruptible_greeting - Control when caller input can interrupt speech.
- transcription_engine and transcription_engine_config - Configure the speech recognition provider.
- custom_parameters - Key-value data forwarded to the relay session.
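The parameters above could be assembled into a JSON request body along the following lines. This is only a sketch: the value types (for example, a boolean for dtmf_detection) and the example values are assumptions, and the HTTP endpoint and authentication are deliberately left out because this guide does not state them.

```python
import json


def build_relay_start_body(ws_url: str, greeting: str) -> str:
    """Serialize a request body for the Start Conversation Relay command."""
    body = {
        "url": ws_url,
        "greeting": greeting,
        "language": "en-US",     # illustrative value
        "dtmf_detection": True,  # boolean type is an assumption
    }
    return json.dumps(body)


print(build_relay_start_body("wss://example.com/relay", "Hello, how can I help?"))
```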
WebSocket process flow
When Conversation Relay starts, Telnyx opens a WebSocket connection to the configured url and sends a setup frame.
Use the setup frame to initialize call-specific state in your application. After setup, your application can send text, play, sendDigits, language, or end frames back to Telnyx.
Telnyx does not reconnect automatically if the WebSocket closes. Closing the WebSocket terminates the Conversation Relay session.
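One way to model this flow is a small per-connection object that insists on setup arriving first and keeps the frame for later use. The sketch below assumes frames are JSON objects with a type field; the class name and the decision to store the whole setup frame are choices made for illustration, since the setup frame's fields are not listed here.

```python
import json
from typing import Optional


class RelaySession:
    """Holds per-call state for one Conversation Relay WebSocket connection."""

    def __init__(self) -> None:
        self.setup: Optional[dict] = None

    def on_message(self, raw: str) -> Optional[str]:
        """Record the setup frame, then return the type of each later frame."""
        frame = json.loads(raw)
        frame_type = frame.get("type")  # assumed discriminator field
        if self.setup is None:
            if frame_type != "setup":
                raise RuntimeError("expected setup as the first frame")
            self.setup = frame  # keep the whole frame as call-specific state
        return frame_type


session = RelaySession()
session.on_message('{"type": "setup"}')
```

Because Telnyx does not reconnect after the WebSocket closes, state like this can be discarded as soon as the connection ends.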
Frames sent by Telnyx
Telnyx sends the following frame types to your WebSocket server.
| Type | Description |
|---|---|
| setup | First frame sent after the WebSocket connects. Identifies the relay session and call. |
| prompt | Caller speech transcribed to text. Partial transcripts use last: false; final transcripts use last: true. |
| dtmf | DTMF digit pressed by the caller. |
| interrupt | Sent when the caller interrupts ongoing TTS playback. |
| error | Sent when your application sends an invalid frame or another relay error occurs. |
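A server typically routes each incoming frame by its type. The following sketch shows one possible dispatcher; it assumes JSON frames carrying a type field, and the function and handler names are hypothetical.

```python
import json
from typing import Callable, Dict


def make_dispatcher(handlers: Dict[str, Callable[[dict], None]]) -> Callable[[str], bool]:
    """Return a function that routes a raw frame to the handler for its type."""

    def dispatch(raw: str) -> bool:
        frame = json.loads(raw)
        handler = handlers.get(frame.get("type"))  # assumed discriminator field
        if handler is None:
            return False  # unknown or unhandled frame type
        handler(frame)
        return True

    return dispatch


received = []
dispatch = make_dispatcher({
    "prompt": lambda frame: received.append(("prompt", frame)),
    "error": lambda frame: received.append(("error", frame)),
})
dispatch('{"type": "prompt"}')
```

Returning False for unhandled types lets the caller decide whether to log or ignore them.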
Prompt frame
Telnyx sends prompt frames as the caller speaks.
Treat last: false prompts as interim transcription updates. Use last: true prompts as the final transcript for the caller’s utterance.
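The interim/final distinction can be handled with a small tracker like the one sketched below. The last field comes from the description above; the voicePrompt field name for the transcript text is a hypothetical placeholder, so check it against the actual frame payload.

```python
from typing import Optional


class PromptTracker:
    """Keeps the latest interim transcript and releases only final ones."""

    def __init__(self) -> None:
        self.interim = ""

    def on_prompt(self, frame: dict) -> Optional[str]:
        text = frame.get("voicePrompt", "")  # hypothetical field name
        if frame.get("last") is True:
            self.interim = ""
            return text  # final transcript for this utterance
        self.interim = text  # interim update, useful for live display only
        return None


tracker = PromptTracker()
tracker.on_prompt({"type": "prompt", "voicePrompt": "I need", "last": False})
final = tracker.on_prompt({"type": "prompt", "voicePrompt": "I need help", "last": True})
```

Waiting for the final transcript before calling an LLM avoids generating responses to partial utterances.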
DTMF frame
When DTMF detection is enabled, keypad input is sent as dtmf frames.
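Keypad flows often need several digits collected into one entry. The sketch below buffers digits until a terminator key arrives; the digit field name and the use of # as a terminator are assumptions for illustration.

```python
from typing import List, Optional


class DigitCollector:
    """Buffers dtmf frames until the terminator key is pressed."""

    def __init__(self, terminator: str = "#") -> None:
        self.terminator = terminator
        self.buffer: List[str] = []

    def on_dtmf(self, frame: dict) -> Optional[str]:
        digit = frame.get("digit", "")  # hypothetical field name
        if digit == self.terminator:
            entry = "".join(self.buffer)
            self.buffer = []
            return entry  # complete entry, terminator excluded
        self.buffer.append(digit)
        return None


collector = DigitCollector()
results = [collector.on_dtmf({"type": "dtmf", "digit": d}) for d in "42#"]
```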
Interrupt frame
When the caller barges in over TTS playback, Telnyx sends an interrupt frame.
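A common reaction to an interrupt is to stop sending speech the caller no longer wants to hear. The sketch below keeps unsent text chunks in a queue and drops them when an interrupt frame arrives; the class and method names are hypothetical, and no interrupt frame fields are read.

```python
from collections import deque
from typing import Deque, Optional


class SpeechQueue:
    """Holds text chunks that have not yet been sent to Telnyx."""

    def __init__(self) -> None:
        self.pending: Deque[str] = deque()

    def enqueue(self, token: str) -> None:
        self.pending.append(token)

    def next_token(self) -> Optional[str]:
        """Return the next chunk to send, or None when the queue is empty."""
        return self.pending.popleft() if self.pending else None

    def on_interrupt(self, frame: dict) -> int:
        """Discard queued chunks and return how many were dropped."""
        dropped = len(self.pending)
        self.pending.clear()
        return dropped


queue = SpeechQueue()
for chunk in ("Your balance ", "is ", "forty dollars."):
    queue.enqueue(chunk)
first = queue.next_token()
dropped = queue.on_interrupt({"type": "interrupt"})
```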
Frames sent by your application
Your WebSocket server sends the following frame types to Telnyx.
| Type | Description |
|---|---|
| text | Text fragment to speak back to the caller using TTS. |
| play | Audio URL to play into the call. |
| sendDigits | DTMF digits to send on the call. |
| language | Change TTS and/or transcription language during the session. |
| end | Gracefully end the Conversation Relay session. |
Sending text
Send a text frame to speak text to the caller. The text content is sent in the token field.
To stream a response, send intermediate chunks with last: false, then send the final chunk with last: true.
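The chunking rule can be sketched as a generator that marks only the final chunk with last: true. The token and last fields follow the description above; the type field and the function name are assumptions for this sketch.

```python
import json
from typing import Iterable, Iterator


def text_frames(chunks: Iterable[str]) -> Iterator[str]:
    """Yield JSON text frames; only the final frame carries last: true."""
    iterator = iter(chunks)
    try:
        current = next(iterator)
    except StopIteration:
        return  # nothing to say, so emit no frames
    for upcoming in iterator:
        yield json.dumps({"type": "text", "token": current, "last": False})
        current = upcoming
    yield json.dumps({"type": "text", "token": current, "last": True})


for frame in text_frames(["Hello, ", "how can I ", "help?"]):
    print(frame)
```

Holding back one chunk at a time lets the generator work with a stream whose length is not known in advance, such as LLM output.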
If you omit last, Telnyx treats it as false. Send last: true when the turn is complete.
Playing audio
Send a play frame to play an audio file by URL.
Sending DTMF digits
Send a sendDigits frame to send DTMF digits on the call.
Valid characters are 0-9, A-D, w or W for a pause, #, and *.
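Validating the digit string before sending avoids triggering an error frame. The sketch below checks the character set listed above; the digits field name and the type field are assumptions, as is the function name.

```python
import json
import re

# Characters listed above: 0-9, A-D, w or W (pause), #, and *.
_VALID_DIGITS = re.compile(r"[0-9A-DwW#*]+")


def send_digits_frame(digits: str) -> str:
    """Return a JSON sendDigits frame, rejecting unsupported characters."""
    if not _VALID_DIGITS.fullmatch(digits):
        raise ValueError(f"unsupported DTMF sequence: {digits!r}")
    return json.dumps({"type": "sendDigits", "digits": digits})  # hypothetical field name


print(send_digits_frame("1w2#"))
```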
Changing language
Send a language frame to change TTS and/or transcription language.
At least one of ttsLanguage or transcriptionLanguage must be provided.
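The requirement can be enforced when the frame is built. In this sketch the ttsLanguage and transcriptionLanguage fields come from the text above, while the type field, the function name, and the example language codes are assumptions.

```python
import json
from typing import Optional


def language_frame(
    tts_language: Optional[str] = None,
    transcription_language: Optional[str] = None,
) -> str:
    """Return a JSON language frame with at least one language set."""
    if tts_language is None and transcription_language is None:
        raise ValueError("provide ttsLanguage, transcriptionLanguage, or both")
    frame = {"type": "language"}
    if tts_language is not None:
        frame["ttsLanguage"] = tts_language
    if transcription_language is not None:
        frame["transcriptionLanguage"] = transcription_language
    return json.dumps(frame)


print(language_frame(tts_language="es-ES", transcription_language="es-ES"))
```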
Ending the session
Send an end frame to end the Conversation Relay session gracefully.
Continuing the call after Conversation Relay
The <Connect> verb runs in synchronous mode. When the nested <ConversationRelay> service stops, Telnyx either continues with the next TeXML instructions in the same response or, when action is set on <Connect>, sends a request to that callback URL so your application can return the next TeXML document.
For example, you can provide a follow-up prompt after Conversation Relay ends.
In that case, the <Say> and <Hangup> instructions are already present after <Connect> in the same response.
If you set action, Telnyx requests the next instructions from your callback URL.
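A callback handler for the action URL could return a document like the one built in this sketch. The <Say> and <Hangup> verbs are the ones mentioned above; the <Response> root element, the function name, and the message text are assumptions, and the web framework that would serve this response is omitted.

```python
import xml.etree.ElementTree as ET


def next_texml_after_relay(message: str) -> str:
    """Return the TeXML to run once Conversation Relay has stopped."""
    response = ET.Element("Response")  # assumed TeXML root element
    say = ET.SubElement(response, "Say")
    say.text = message
    ET.SubElement(response, "Hangup")
    return ET.tostring(response, encoding="unicode")


print(next_texml_after_relay("Thanks for calling. Goodbye."))
```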
Webhooks
When using Programmable Voice, Conversation Relay lifecycle events are delivered to your Call Control webhook URL. When the relay session ends, Telnyx sends:
reason is customer_disconnect.
Error handling
If your application sends malformed or invalid frames, Telnyx sends an error frame.
Next steps
- Start with prompt frames to react to caller speech.
- Send text frames to stream LLM responses back to the caller.
- Use dtmf and sendDigits frames to integrate keypad-driven flows.
- Use language frames for multilingual conversations.
- Use end when your application is ready to leave Conversation Relay.