Voice format: inworld.<Model>.<VoiceId>
Models: inworld-tts-1.5-mini (alias Mini, faster) and inworld-tts-1.5-max (alias Max, higher quality). Defaults to mini.
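
As a rough illustration of the voice format, the sketch below assembles and splits voice strings. The helper names build_voice and parse_voice are hypothetical, and reading "Defaults to mini" as the default model argument is an assumption. The prefix check ignores case because the sample table writes the prefix as Inworld while the format line writes inworld.

```python
# Model aliases as named in the text; the helpers themselves are illustrative only.
MODEL_ALIASES = {"mini": "Mini", "max": "Max"}


def build_voice(voice_id: str, model: str = "mini") -> str:
    """Return a voice string of the form inworld.<Model>.<VoiceId>."""
    try:
        alias = MODEL_ALIASES[model.lower()]
    except KeyError:
        raise ValueError(f"unknown model alias: {model!r}") from None
    if not voice_id:
        raise ValueError("voice_id must be non-empty")
    return f"inworld.{alias}.{voice_id}"


def parse_voice(voice: str) -> tuple[str, str]:
    """Split a voice string into (model alias, voice id)."""
    # Unpacking raises ValueError if the string has fewer than three parts.
    provider, model, voice_id = voice.split(".", 2)
    if provider.lower() != "inworld":
        raise ValueError(f"unexpected provider prefix: {provider!r}")
    return model, voice_id
```

For example, build_voice("Loretta") falls back to the Mini alias, and parse_voice("Inworld.Max.Hank") separates the model alias from the voice ID.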

Voice Samples

| Voice | Model | Gender |
| --- | --- | --- |
| Inworld.Max.Hank | Max | Male |
| Inworld.Mini.Loretta | Mini | Female |

WebSocket

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| audio_format | string | mp3 | mp3, linear16. |
| sample_rate | integer | 24000 | 8000, 16000, 22050, 24000, 44100, 48000. |
| language | string | | BCP-47 language code. |
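
A minimal sketch of how these query parameters might be attached to a connection URL. The function name build_ws_url and the base URL in the usage note are placeholders, not the real endpoint, which this section does not give. The allowed values are copied from the table above.

```python
from urllib.parse import urlencode

# Allowed values as listed in the query parameter table.
ALLOWED_FORMATS = {"mp3", "linear16"}
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}


def build_ws_url(base_url, audio_format="mp3", sample_rate=24000, language=None):
    """Append validated query parameters to a WebSocket base URL (hypothetical helper)."""
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported audio_format: {audio_format!r}")
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate!r}")
    params = {"audio_format": audio_format, "sample_rate": sample_rate}
    # language has no documented default, so it is only sent when provided.
    if language is not None:
        params["language"] = language
    return f"{base_url}?{urlencode(params)}"
```

Calling build_ws_url("wss://example.invalid/tts", "linear16", 16000, "en-US") yields a URL ending in ?audio_format=linear16&sample_rate=16000&language=en-US; substitute the actual endpoint for the placeholder host.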

Voice Settings

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| encoding | string | MP3 | MP3 or LINEAR16. |
| sample_rate | integer | 24000 | Output sample rate in Hz. |
| language_code | string | | BCP-47 language code. Overrides the language query parameter. |
{
  "text": " ",
  "voice_settings": {
    "encoding": "LINEAR16",
    "sample_rate": 16000
  }
}
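
The message above could be produced programmatically along these lines. This is a sketch only: build_settings_message is a hypothetical helper, and the single-space default for text simply mirrors the sample rather than any documented requirement.

```python
import json

# Encodings as listed in the voice settings table.
ALLOWED_ENCODINGS = {"MP3", "LINEAR16"}


def build_settings_message(text=" ", encoding="MP3", sample_rate=24000, language_code=None):
    """Serialize a message carrying voice_settings as a JSON string (hypothetical helper)."""
    if encoding not in ALLOWED_ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding!r}")
    settings = {"encoding": encoding, "sample_rate": sample_rate}
    # language_code is optional; when present it overrides the language query parameter.
    if language_code is not None:
        settings["language_code"] = language_code
    return json.dumps({"text": text, "voice_settings": settings})
```

build_settings_message(encoding="LINEAR16", sample_rate=16000) serializes the same object as the sample message.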

REST API

Fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| encoding | string | MP3 | MP3 or LINEAR16. |
| sample_rate | integer | 24000 | Output sample rate in Hz. |
| language_code | string | | BCP-47 language code. |
| output_type | string | binary_output | binary_output, base64_output, or audio_id. |

Response

By default (output_type "binary_output"), the response is chunked audio bytes. With output_type set to "base64_output", the response is JSON containing base64-encoded audio. With output_type set to "audio_id", the response is JSON containing an audio_url for deferred retrieval.
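
The three response shapes might be handled on the client roughly as follows. This is a sketch under assumptions: decode_tts_response is a hypothetical helper, and the JSON key holding the base64 audio is not named in this section, so "audio" is a placeholder passed in as a parameter. Only audio_url comes from the text.

```python
import base64
import json

OUTPUT_TYPES = {"binary_output", "base64_output", "audio_id"}


def decode_tts_response(output_type, body, audio_field="audio"):
    """Return audio bytes, or the audio_url string when output_type is audio_id.

    body is the complete response body as bytes; audio_field is an assumed key name.
    """
    if output_type not in OUTPUT_TYPES:
        raise ValueError(f"unknown output_type: {output_type!r}")
    if output_type == "binary_output":
        # The body already is the audio; chunks are assumed to be joined by the caller.
        return body
    payload = json.loads(body)
    if output_type == "base64_output":
        return base64.b64decode(payload[audio_field])
    # audio_id: the audio is fetched later from the returned URL.
    return payload["audio_url"]
```

Returning a URL string in the audio_id case keeps the deferred download as a separate step, which the caller can schedule whenever the audio is actually needed.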