JavaScript

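import Telnyx from 'telnyx';

const client = new Telnyx({
  apiKey: process.env['TELNYX_API_KEY'], // This is the default and can be omitted
});

// Request base64-encoded JSON so that response.base64_audio is populated;
// the default output_type (binary_output) returns raw audio bytes instead.
const response = await client.textToSpeech.generate({
  voice: 'telnyx.NaturalHD.Alloy', // shorthand for provider, model, and voice
  text: 'Hello from Telnyx text-to-speech.',
  output_type: 'base64_output',
});

console.log(response.base64_audio);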

"<string>"

Generate speech from text

Generate synthesized speech audio from text input. Returns audio in the requested format (binary audio stream, base64-encoded JSON, or an audio URL for later retrieval).

Authentication is provided via the standard Authorization: Bearer <API_KEY> header.

The voice parameter provides a convenient shorthand to specify provider, model, and voice in a single string (e.g. telnyx.NaturalHD.Alloy). Alternatively, specify provider explicitly along with provider-specific parameters.

Supported providers: aws, telnyx, azure, elevenlabs, minimax, rime, resemble.

POST

text-to-speech

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
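
For callers not using the SDK, the header described above can be sketched as follows. This is a minimal illustration that only builds the arguments for a `fetch()` call and sends nothing. The helper name `buildTtsRequest` and the placeholder URL are assumptions made for the example; take the real endpoint URL from your Telnyx account documentation.

```javascript
// Hypothetical helper: builds the pieces of a POST request without sending it.
// endpointUrl is supplied by the caller; this sketch does not assume the real path.
function buildTtsRequest(endpointUrl, apiKey, body) {
  if (!apiKey) {
    throw new Error('An API key is required for the Authorization header');
  }
  return {
    url: endpointUrl,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`, // Bearer <token> form from the docs
        'Content-Type': 'application/json', // the body is application/json
      },
      body: JSON.stringify(body),
    },
  };
}

const request = buildTtsRequest(
  'https://example.invalid/text-to-speech', // placeholder URL, not the real endpoint
  'KEY_PLACEHOLDER',
  { voice: 'telnyx.NaturalHD.Alloy', text: 'Hello' },
);

console.log(request.init.headers.Authorization); // prints "Bearer KEY_PLACEHOLDER"
```

The result can be passed on as `fetch(request.url, request.init)` once a real URL and key are supplied.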

Body

application/json

Request body for generating speech from text.

voice (string)

Voice identifier in the format provider.model_id.voice_id or provider.voice_id. Examples: telnyx.NaturalHD.Alloy, azure.en-US-AvaMultilingualNeural, aws.Polly.Generative.Lucia. When provided, provider, model_id, and voice_id are extracted automatically and take precedence over individual parameters.
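
To make the identifier format concrete, here is a rough client-side sketch of how such a string could be split. The splitting rule is an assumption: the first segment is treated as the provider, the last as voice_id, and anything in between as model_id. The service performs its own extraction and may divide multi-segment identifiers such as aws.Polly.Generative.Lucia differently. `parseVoice` is a hypothetical helper, not part of the SDK.

```javascript
// Hypothetical helper: splits a voice identifier into its parts.
// Assumed rule: first segment = provider, last = voice_id, middle = model_id.
function parseVoice(voice) {
  const parts = voice.split('.');
  if (parts.length < 2 || parts.some((part) => part === '')) {
    throw new Error(`Unrecognized voice identifier: ${voice}`);
  }
  const middle = parts.slice(1, -1);
  return {
    provider: parts[0],
    model_id: middle.length > 0 ? middle.join('.') : null,
    voice_id: parts[parts.length - 1],
  };
}

console.log(parseVoice('telnyx.NaturalHD.Alloy'));
console.log(parseVoice('azure.en-US-AvaMultilingualNeural'));
```

A helper like this is mainly useful for logging or for pre-selecting provider-specific settings before the request is sent.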

text (string)

The text to convert to speech.

provider (enum<string>)

TTS provider. Required unless voice is provided.

Available options: aws, telnyx, azure, elevenlabs, minimax, rime, resemble
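
The "required unless voice is provided" rule can be checked before a request is sent. The sketch below is a client-side convenience only; `checkProviderSelection` is a hypothetical name, and the server remains the authority on what it accepts.

```javascript
// Provider names as listed in this reference.
const PROVIDERS = ['aws', 'telnyx', 'azure', 'elevenlabs', 'minimax', 'rime', 'resemble'];

// Hypothetical pre-flight check: returns a list of problems (empty means OK).
function checkProviderSelection(body) {
  const problems = [];
  if (!body.voice && !body.provider) {
    problems.push('Either voice or provider must be set');
  }
  if (body.provider && !PROVIDERS.includes(body.provider)) {
    problems.push(`Unknown provider: ${body.provider}`);
  }
  return problems;
}

console.log(checkProviderSelection({ text: 'Hi' }));
console.log(checkProviderSelection({ voice: 'telnyx.NaturalHD.Alloy', text: 'Hi' }));
```

Returning a list rather than throwing lets a caller report every problem at once.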

language (string)

Language code (e.g. en-US). Usage varies by provider.

text_type (enum<string>)

Text type. Use ssml for SSML-formatted input (supported by AWS and Azure).

Available options: text, ssml
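
As an illustration of the ssml option, the following sketch wraps plain text in a bare `<speak>` element and escapes XML special characters first. Both helper names are hypothetical, and a bare `<speak>` wrapper is an assumption: which SSML elements and attributes each provider requires is not covered in this reference.

```javascript
// Escape the five XML special characters so plain text is safe inside SSML.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: builds a request body that marks the text as SSML.
function buildSsmlBody(voice, plainText) {
  return {
    voice,
    text: `<speak>${escapeXml(plainText)}</speak>`,
    text_type: 'ssml',
  };
}

const ssmlBody = buildSsmlBody('aws.Polly.Generative.Lucia', 'Rock & roll');
console.log(ssmlBody.text); // prints "<speak>Rock &amp; roll</speak>"
```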

output_type (enum<string>, default: binary_output)

Determines the response format. binary_output returns raw audio bytes; base64_output returns base64-encoded audio in JSON.

Available options: binary_output, base64_output

disable_cache (boolean, default: false)

When true, bypass the audio cache and generate fresh audio.

voice_settings (object)

Provider-specific voice settings. Contents vary by provider; see the provider-specific parameter objects below.

aws (object)

AWS Polly provider-specific parameters.

telnyx (object)

Telnyx provider-specific parameters.

azure (object)

Azure Cognitive Services provider-specific parameters.

elevenlabs (object)

ElevenLabs provider-specific parameters.

minimax (object)

Minimax provider-specific parameters.

rime (object)

Rime provider-specific parameters.

resemble (object)

Resemble AI provider-specific parameters.

Response

Speech generated successfully. The response format depends on the output_type parameter:

binary_output (default): Returns raw audio bytes with the appropriate Content-Type header (e.g. audio/mpeg).
base64_output: Returns a JSON object with a base64_audio field.

Raw audio bytes. Returned when output_type is binary_output (default).
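
Either response shape can be normalized to audio bytes on the client. The sketch below assumes Node.js (it relies on the global `Buffer`) and assumes the caller has already read the HTTP response into an ArrayBuffer for binary_output or parsed JSON for base64_output. `audioBufferFromResponse` is a hypothetical helper name.

```javascript
// Hypothetical helper: returns a Node.js Buffer of audio bytes for either output type.
// payload is an ArrayBuffer for binary_output, or the parsed JSON object for base64_output.
function audioBufferFromResponse(outputType, payload) {
  if (outputType === 'base64_output') {
    if (typeof payload.base64_audio !== 'string') {
      throw new Error('Expected a base64_audio string in the JSON response');
    }
    return Buffer.from(payload.base64_audio, 'base64');
  }
  // binary_output: wrap the raw bytes without re-encoding them.
  return Buffer.from(payload);
}

// 'aGVsbG8=' is the base64 encoding of the ASCII text "hello", used as stand-in data.
const decoded = audioBufferFromResponse('base64_output', { base64_audio: 'aGVsbG8=' });
console.log(decoded.toString('utf8')); // prints "hello"
```

The resulting Buffer can then be written to disk, for example with `fs.writeFileSync`, using a file extension that matches the returned Content-Type.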
