Gather using speak
POST /calls/:call_control_id/actions/gather_using_speak
Convert text to speech and play it on the call until the required DTMF signals are gathered to build interactive menus.
You can pass a list of valid digits along with an 'invalid_payload', which will be played back at the beginning of each prompt. Speech will be interrupted when a DTMF signal is received. The Answer command must be issued before the gather_using_speak command.
Expected Webhooks (see callback schema below):
- call.dtmf.received (you may receive many of these webhooks)
- call.gather.ended
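The webhook flow above might be consumed as in the following Python sketch. It assumes each webhook body is JSON with the event name at data.event_type and the event fields under data.payload (a digit field for call.dtmf.received, a digits field for call.gather.ended). Those field names are assumptions not stated in this section, so check them against the callback schema. The function name handle_webhook is hypothetical.

```python
import json


def handle_webhook(raw_body, collected):
    """Update `collected` from one webhook body; return final digits or None.

    Assumed layout (not confirmed by this section):
    {"data": {"event_type": "...", "payload": {...}}}
    """
    event = json.loads(raw_body).get("data", {})
    event_type = event.get("event_type")
    payload = event.get("payload", {})

    if event_type == "call.dtmf.received":
        # One webhook per key press; many may arrive during a single gather.
        collected.append(payload.get("digit", ""))
        return None
    if event_type == "call.gather.ended":
        # Prefer the digits reported by the final event, if present;
        # otherwise fall back to what was collected key by key.
        return payload.get("digits", "".join(collected))
    return None  # Unrelated event: ignore it.
```

The caller keeps one collected list per call and treats a non-None return value as the end of the gather.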
Request
Path Parameters
call_control_id
Unique identifier and token for controlling the call
Body (application/json, required)
Gather using speak request
Voice format (for the voice parameter):
- Define voices using the format <Provider>.<Model>.<VoiceId>. Specifying only the provider will give default values for voice_id and model_id.
- AWS: Use AWS.Polly.<VoiceId> (e.g., AWS.Polly.Joanna). For neural voices, which provide more realistic, human-like speech, append -Neural to the VoiceId (e.g., AWS.Polly.Joanna-Neural). Check the available voices for compatibility.
- Azure: Use the Azure. prefix followed by the voice name (e.g., Azure.en-CA-ClaraNeural, Azure.en-CA-LiamNeural, Azure.en-US-BrianMultilingualNeural, Azure.en-US-AvaMultilingualNeural). For a complete list of voices, go to Azure Voice Gallery.
- ElevenLabs: Use ElevenLabs.<ModelId>.<VoiceId> (e.g., ElevenLabs.eleven_multilingual_v2.21m00Tcm4TlvDq8ikWAM). The ModelId part is optional. To use ElevenLabs, you must provide your ElevenLabs API key as an integration identifier secret in "voice_settings": {"api_key_ref": "<secret_identifier>"}. See integration secrets documentation for details. Check available voices.
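The voice naming rules above can be sketched as a small Python helper. The function names build_voice and polly_voice are hypothetical, and the helper only joins strings. It does not check that a voice actually exists with the provider.

```python
def build_voice(provider, voice_id=None, model=None):
    """Join the parts of a voice identifier with dots, skipping missing ones.

    Produces <Provider>, <Provider>.<VoiceId>, or <Provider>.<Model>.<VoiceId>.
    """
    if not provider:
        raise ValueError("provider is required")
    if model and not voice_id:
        raise ValueError("a model must be followed by a voice_id")
    return ".".join(part for part in (provider, model, voice_id) if part)


def polly_voice(voice_id, neural=False):
    """Build an AWS Polly voice name, appending -Neural when requested."""
    suffix = "-Neural" if neural else ""
    return build_voice("AWS", voice_id + suffix, model="Polly")
```

For example, polly_voice("Joanna", neural=True) yields AWS.Polly.Joanna-Neural, and build_voice("Azure", "en-CA-ClaraNeural") yields Azure.en-CA-ClaraNeural.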
payload
The text or SSML to be converted into speech. There is a 3,000 character limit.
invalid_payload
The text or SSML to be converted into speech when the digits don't match the valid_digits parameter or the number of digits is not between minimum_digits and maximum_digits. There is a 3,000 character limit.
payload_type
Possible values: [text, ssml]
Default value: text
The type of the provided payload. The payload can be either plain text or Speech Synthesis Markup Language (SSML).
service_level
Possible values: [basic, premium]
Default value: premium
This parameter impacts speech quality, language options and payload types. When using basic, only the en-US language and payload type text are allowed.
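The limits described for payload, invalid_payload, payload_type and service_level could be checked on the client before sending a request. The following Python sketch is one way to do that. The function name check_speak_body is hypothetical, and the checks only mirror the rules stated in this section; the API remains the authority on what it accepts.

```python
MAX_PAYLOAD_CHARS = 3000  # limit stated for payload and invalid_payload


def check_speak_body(body):
    """Return a list of problems found in a request body dict (empty if none)."""
    problems = []
    for key in ("payload", "invalid_payload"):
        text = body.get(key)
        if text is not None and len(text) > MAX_PAYLOAD_CHARS:
            problems.append(f"{key} exceeds {MAX_PAYLOAD_CHARS} characters")

    payload_type = body.get("payload_type", "text")       # documented default
    service_level = body.get("service_level", "premium")  # documented default
    if payload_type not in ("text", "ssml"):
        problems.append("payload_type must be text or ssml")
    if service_level not in ("basic", "premium"):
        problems.append("service_level must be basic or premium")

    if service_level == "basic":
        # basic only allows plain text in en-US, per the description above.
        if payload_type != "text":
            problems.append("basic service_level only allows payload_type text")
        if body.get("language", "en-US") != "en-US":
            problems.append("basic service_level only allows language en-US")
    return problems
```

An empty list means none of the documented constraints were violated.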
voice
Specifies the voice used in speech synthesis.
Supported Providers: AWS, Azure and ElevenLabs (see the voice format notes under the request body description).
For service_level basic, you may define the gender of the speaker (male or female).
voice_settings
object
The settings associated with the voice selected
language
Possible values: [arb, cmn-CN, cy-GB, da-DK, de-DE, en-AU, en-GB, en-GB-WLS, en-IN, en-US, es-ES, es-MX, es-US, fr-CA, fr-FR, hi-IN, is-IS, it-IT, ja-JP, ko-KR, nb-NO, nl-NL, pl-PL, pt-BR, pt-PT, ro-RO, ru-RU, sv-SE, tr-TR]
The language you want spoken. This parameter is ignored when a Polly.* voice is specified.
minimum_digits
Default value: 1
The minimum number of digits to fetch. This parameter has a minimum value of 1.
maximum_digits
Default value: 128
The maximum number of digits to fetch. This parameter has a maximum value of 128.
maximum_tries
Default value: 3
The maximum number of times that a file should be played back if there is no input from the user on the call.
timeout_millis
Default value: 60000
The number of milliseconds to wait for a DTMF response after speak ends before replaying the sound file.
terminating_digit
Default value: #
The digit used to terminate input if fewer than maximum_digits digits have been gathered.
valid_digits
Default value: 0123456789#*
A list of all digits accepted as valid.
inter_digit_timeout_millis
Default value: 5000
The number of milliseconds to wait for input between digits.
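The digit-related settings above can be sanity-checked locally. The Python sketch below validates the documented bounds and models the rule given for invalid_payload: input is rejected when a digit is not in valid_digits or the count falls outside the minimum and maximum. Both function names are hypothetical. The sketch assumes the gathered string excludes the terminating digit, which this section does not state, and it is a local model rather than the service's own logic.

```python
def check_digit_settings(minimum_digits=1, maximum_digits=128):
    """Return a list of problems with the digit-count settings (empty if none)."""
    problems = []
    if minimum_digits < 1:
        problems.append("minimum_digits must be at least 1")
    if maximum_digits > 128:
        problems.append("maximum_digits must be at most 128")
    if minimum_digits > maximum_digits:
        # Not stated in the docs; a plain consistency check.
        problems.append("minimum_digits exceeds maximum_digits")
    return problems


def digits_acceptable(digits, minimum_digits=1, maximum_digits=128,
                      valid_digits="0123456789#*"):
    """True when every digit is valid and the count is within bounds."""
    in_range = minimum_digits <= len(digits) <= maximum_digits
    all_valid = all(d in valid_digits for d in digits)
    return in_range and all_valid
```

With valid_digits set to "123", the input "124" would be rejected because 4 is not an accepted digit.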
client_state
Use this field to add state to every subsequent webhook. It must be a valid Base-64 encoded string.
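Because the state must be Base-64 encoded, a pair of helpers along the following lines can be used to encode it before sending and decode it when it comes back in a webhook. This is a minimal Python sketch using the standard library. The helper names are hypothetical, and the use of UTF-8 for the underlying text is an assumption.

```python
import base64


def encode_client_state(text):
    """Encode a text value as a Base-64 string suitable for client_state."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_client_state(value):
    """Decode a Base-64 client_state value back to text.

    validate=True makes malformed input raise instead of being silently skipped.
    """
    return base64.b64decode(value, validate=True).decode("utf-8")
```

The client_state value used in the request sample, aGF2ZSBhIG5pY2UgZGF5ID1d, decodes to "have a nice day =]".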
command_id
Use this field to avoid duplicate commands. Telnyx will ignore any command with the same command_id for the same call_control_id.
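One way to benefit from this is to generate the identifier once and reuse it when retrying the same command, so a retry after a network failure is not executed twice. The Python sketch below assumes a UUID is an acceptable command_id, as in the request sample; this section does not state a required format. The helper name with_command_id is hypothetical.

```python
import uuid


def with_command_id(body, command_id=None):
    """Return a copy of `body` carrying a command_id, generating one if absent."""
    tagged = dict(body)  # copy so the caller's dict is left untouched
    tagged["command_id"] = command_id or str(uuid.uuid4())
    return tagged
```

A retry would pass the command_id from the first attempt, producing an identical body.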
Responses
200: Successful response upon making a call control command.
- application/json
default: Unexpected error
- application/json
Request samples
curl -L 'https://api.telnyx.com/v2/calls/:call_control_id/actions/gather_using_speak' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -H 'Authorization: Bearer <TOKEN>' \
  -d '{
    "payload": "say this on call",
    "invalid_payload": "say this on call",
    "payload_type": "text",
    "service_level": "premium",
    "voice": "male",
    "language": "arb",
    "minimum_digits": 1,
    "maximum_digits": 10,
    "terminating_digit": "#",
    "valid_digits": "123",
    "inter_digit_timeout_millis": 10000,
    "client_state": "aGF2ZSBhIG5pY2UgZGF5ID1d",
    "command_id": "891510ac-f3e4-11e8-af5b-de00688a4901"
  }'
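The same request can be assembled in Python with only the standard library. The sketch below builds the request object without sending it. The function name build_gather_request is hypothetical, and the call_control_id is inserted into the path as given, which assumes it needs no extra URL escaping.

```python
import json
import urllib.request

API_BASE = "https://api.telnyx.com/v2"


def build_gather_request(call_control_id, token, body):
    """Build (but do not send) a POST request for gather_using_speak."""
    url = f"{API_BASE}/calls/{call_control_id}/actions/gather_using_speak"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),  # JSON body as bytes
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. Keeping construction separate makes the request easy to inspect or log first.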
Response samples
200
{
  "data": {
    "result": "ok"
  }
}
default
{
  "errors": [
    {
      "code": "string",
      "title": "string",
      "detail": "string",
      "source": {
        "pointer": "string",
        "parameter": "string"
      },
      "meta": {}
    }
  ]
}
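A client might distinguish the two response shapes above as in this Python sketch. It relies only on the fields shown in the samples (data.result, and code, title and detail on each error). The CommandError class and the parse_command_response function are hypothetical names.

```python
import json


class CommandError(Exception):
    """Raised when the response body carries an errors array."""


def parse_command_response(raw_body):
    """Return True for a {"data": {"result": "ok"}} body; raise on errors."""
    doc = json.loads(raw_body)
    if "errors" in doc:
        # Combine every reported error into one readable message.
        details = "; ".join(
            f"{err.get('code')}: {err.get('title')} ({err.get('detail')})"
            for err in doc["errors"]
        )
        raise CommandError(details)
    return doc.get("data", {}).get("result") == "ok"
```

A False return value means the body was neither an error nor the expected ok result, which the caller may want to log.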