Introduction

In this tutorial, you will learn how to use Text-to-Speech (TTS) on your calls with the Voice API and TeXML.

Before starting, please ensure your Voice API or TeXML application is correctly configured.

Video Tutorial

Watch this comprehensive video demonstration to see Text-to-Speech features in action:

This video demonstrates how to use Telnyx's Text-to-Speech capabilities to create dynamic voice interactions in your applications.

Telnyx internal Text-to-Speech engine

Telnyx provides a high-quality, low-latency Text-to-Speech (TTS) engine, offering a seamless experience for integrating speech synthesis into your calls. The Telnyx TTS engine ensures a clear and natural-sounding voice, making it an excellent choice for real-time voice applications.

You can request Telnyx TTS for your calls using the Voice API. Below is an example of how to trigger speech synthesis with Telnyx TTS:

curl --location 'https://api.telnyx.com/v2/calls/v3:6MytEd1c56mFmXlAziof4tQd-eqOgwQqpFAvECu1gBRrvD5rmsclfg/actions/speak' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer your_api_key' \
--data '{
"payload": "The text that should be said on the call",
"voice": "Telnyx.KokoroTTS.af"
}'
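
The same request can also be assembled from application code. The following is a minimal Python sketch, not an official client: the helper name build_speak_request and its parameters are invented for illustration, and only the URL shape and the payload and voice fields are taken from the curl example above. It assumes the API key is sent with the Bearer scheme. The request is built but not sent, so it can be inspected first.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.telnyx.com/v2"


def build_speak_request(call_control_id, text, voice, api_key):
    """Build (but do not send) a speak request for an active call.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Call control IDs look like "v3:..."; keep the colon unescaped in the path.
    path_id = urllib.parse.quote(call_control_id, safe=":")
    body = json.dumps({"payload": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/calls/{path_id}/actions/speak",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the API key is presented as a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
    )


if __name__ == "__main__":
    req = build_speak_request(
        "v3:example-call-control-id",  # placeholder, not a real call
        "The text that should be said on the call",
        "Telnyx.KokoroTTS.af",
        "your_api_key",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```

Because the voice is an ordinary string argument, the same helper should work for the other providers described on this page by passing a different voice identifier.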

You can integrate Telnyx TTS into TeXML scripts using the following format:

<Response>
<Say voice="Telnyx.KokoroTTS.af">The text that should be said on the call!</Say>
</Response>
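
If the spoken text comes from user input or a database, generating the TeXML programmatically avoids malformed XML when the text contains characters such as & or <. The sketch below uses Python's standard xml.etree.ElementTree; the helper name build_say_texml is hypothetical, and only the Response and Say elements and the voice attribute come from the example above.

```python
import xml.etree.ElementTree as ET


def build_say_texml(text, voice, **say_attrs):
    """Return a TeXML document containing a single <Say> verb.

    Hypothetical helper; extra keyword arguments become <Say> attributes.
    """
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say", {"voice": voice, **say_attrs})
    say.text = text  # ElementTree escapes &, < and > when serializing
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_say_texml("Terms & conditions apply", "Telnyx.KokoroTTS.af"))
```

Building the element tree rather than concatenating strings means the escaping is handled by the serializer instead of by hand.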

With its high-quality voices and low latency, Telnyx TTS is an excellent choice for users seeking to integrate natural-sounding speech into their applications.

Telnyx Natural

Telnyx Natural voices provide enhanced speech quality with improved naturalness and clarity. These voices offer a significant upgrade from basic text-to-speech options, delivering more human-like speech patterns and better pronunciation accuracy.

You can request Telnyx Natural voices for your calls using the Voice API:

curl --location 'https://api.telnyx.com/v2/calls/v3:6MytEd1c56mFmXlAziof4tQd-eqOgwQqpFAvECu1gBRrvD5rmsclfg/actions/speak' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer your_api_key' \
--data '{
"payload": "The text that should be said on the call",
"voice": "Telnyx.Natural.abbie"
}'

You can integrate Telnyx Natural voices into TeXML scripts using the following format:

<Response>
<Say voice="Telnyx.Natural.abbie">The text that should be said on the call!</Say>
</Response>

Telnyx NaturalHD

Telnyx NaturalHD voices deliver premium-quality speech synthesis with exceptional clarity and richness. These high-definition voices are ideal for applications where audio quality is critical, such as customer service, media production, or premium user experiences.

You can request Telnyx NaturalHD voices for your calls using the Voice API:

curl --location 'https://api.telnyx.com/v2/calls/v3:6MytEd1c56mFmXlAziof4tQd-eqOgwQqpFAvECu1gBRrvD5rmsclfg/actions/speak' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer your_api_key' \
--data '{
"payload": "The text that should be said on the call",
"voice": "Telnyx.NaturalHD.andersen_johan"
}'

You can integrate Telnyx NaturalHD voices into TeXML scripts using the following format:

<Response>
<Say voice="Telnyx.NaturalHD.andersen_johan">The text that should be said on the call!</Say>
</Response>

AWS Polly

Telnyx offers both quality levels of AWS Polly Text-to-Speech: standard and neural. See the linked voice list for the available voices.

You can request it on a call with a Voice API command like the following. Set voice to a standard voice such as Polly.Brian or a neural voice such as Polly.Amy-Neural:

curl --location 'https://api.telnyx.com/v2/calls/v3:6MytEd1c56mFmXlAziof4tQd-eqOgwQqpFAvECu1gBRrvD5rmsclfg/actions/speak' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer your_api_key' \
--data '{
"payload": "The text that should be said on the call",
"voice": "Polly.Amy-Neural"
}'

In a TeXML script, use it as follows:

<Response>
<Say voice="Polly.Amy-Neural">The text that should be said on the call!</Say>
</Response>

To use a neural voice, append the -Neural suffix to the voice name: Polly.*-Neural.
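
The naming rule can be captured in a small helper. This is only an illustrative Python sketch: the function name polly_voice is hypothetical, and it does not check whether a given voice is actually offered at the requested quality level, so confirm that against the voice list.

```python
def polly_voice(name, neural=False):
    """Build a voice identifier such as "Polly.Brian" or "Polly.Amy-Neural".

    Hypothetical helper; it formats the string only and does not
    validate that the voice exists.
    """
    suffix = "-Neural" if neural else ""
    return f"Polly.{name}{suffix}"


if __name__ == "__main__":
    print(polly_voice("Brian"))             # standard voice
    print(polly_voice("Amy", neural=True))  # neural voice
```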

Before using it, check the linked price list.

Azure AI Speech

Telnyx supports Azure AI Speech as a text-to-speech provider. You can find the list of supported voices and languages at the following link.

To use Azure AI Speech, the process is the same as with AWS Polly. Voices should be specified using the following format: Azure.en-CA-ClaraNeural.

Azure AI Speech supports two service levels via Telnyx:

  • Neural

    • These voices use deep neural networks to generate highly natural and expressive speech.
    • Ideal for most general applications, they offer high-quality output with support for SSML to customize pronunciation, pitch, rate, and more.
    • Example: Azure.en-CA-ClaraNeural
  • Neural HD (High Definition)

    • HD voices deliver enhanced clarity and richness for scenarios where audio quality is critical—such as media production or premium customer engagement.
    • These voices provide finer prosody control, improved phonetic detail, and natural pauses, yielding more lifelike speech.
    • Example: en-US-Emma:DragonHDLatestNeural

Here’s an example of using the Telnyx Voice API to synthesize speech with Azure AI Speech:

curl --location 'https://api.telnyx.com/v2/calls/v3:6MytEd1c56mFmXlAziof4tQd-eqOgwQqpFAvECu1gBRrvD5rmsclfg/actions/speak' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer your_api_key' \
--data '{
"payload": "The text that should be said on the call",
"voice": "Azure.en-CA-ClaraNeural"
}'

The corresponding TeXML script would look like this:

<Response>
<Say voice="Azure.en-CA-ClaraNeural">The text that should be said on the call!</Say>
</Response>

ElevenLabs

Users get many voice options with ElevenLabs; however, response latency may exceed what you’d see from AWS Polly or Azure AI Speech.

To use the integration, you must provide an API key for your ElevenLabs account.

Telnyx can keep the key in secure storage. Save it as follows:

curl --location 'https://api.telnyx.com/v2/integration_secrets' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer your_api_key' \
--data '{
"identifier":"your_api_key_ref",
"value":"api_key"
}'
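
When storing the key from a script, it is safer to read both secrets from the environment than to hard-code them. The Python sketch below mirrors the curl example above: the body field names identifier and value are copied from that example and should be checked against the API reference, while the helper name and the environment variable names are invented for illustration. It assumes Bearer authentication and builds the request without sending it.

```python
import json
import os
import urllib.request

SECRETS_URL = "https://api.telnyx.com/v2/integration_secrets"


def build_secret_request(identifier, secret_value, api_key):
    """Build (but do not send) a request that stores an integration secret.

    Hypothetical helper; body fields mirror the curl example in the text.
    """
    body = json.dumps({"identifier": identifier, "value": secret_value})
    return urllib.request.Request(
        url=SECRETS_URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the Telnyx API key is presented as a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
    )


if __name__ == "__main__":
    # Environment variable names are placeholders chosen for this sketch.
    req = build_secret_request(
        "your_api_key_ref",
        os.environ.get("ELEVENLABS_API_KEY", "api_key"),
        os.environ.get("TELNYX_API_KEY", "your_api_key"),
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```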

The speak command should look as follows for the Voice API application:

curl --location 'https://api.telnyx.com/v2/calls/v3:6MytEd1c56mFmXlAziof4tQd-eqOgwQqpFAvECu1gBRrvD5rmsclfg/actions/speak' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer your_api_key' \
--data '{
"payload": "The text that should be said on the call",
"voice": "ElevenLabs.Default.cgSgspJ2msm6clMCkdW9",
"voice_settings": {"api_key_ref": "your_api_key_ref"}
}'
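
The only structural difference from the other providers is the nested voice_settings object that carries the stored key reference. As a rough Python sketch (the function name is hypothetical, and the voice string is passed through unchanged rather than interpreted), the request body could be built like this:

```python
import json


def elevenlabs_speak_body(text, voice, api_key_ref):
    """Return the JSON body for a speak command that uses an ElevenLabs voice.

    Hypothetical helper; field names are copied from the curl example.
    """
    return json.dumps(
        {
            "payload": text,
            "voice": voice,
            # Reference to the secret saved earlier, not the key itself.
            "voice_settings": {"api_key_ref": api_key_ref},
        }
    )


if __name__ == "__main__":
    print(
        elevenlabs_speak_body(
            "The text that should be said on the call",
            "ElevenLabs.Default.cgSgspJ2msm6clMCkdW9",
            "your_api_key_ref",
        )
    )
```

Passing the reference instead of the raw key keeps the ElevenLabs credential out of call-control traffic and application logs.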

A similar effect can be achieved from TeXML using the following script:

<Response>
<Say voice="ElevenLabs.Default.cgSgspJ2msm6clMCkdW9" api_key_ref="your_api_key_ref">The text that should be said on the call!</Say>
</Response>

Please note: only a premium ElevenLabs account can be used for the integration. Free (freemium) accounts are not supported.