Design a Voice from a Prompt

What voice design does

Voice design generates a synthetic voice from a natural language description. You describe what you want — age, tone, accent, pacing — and the AI creates audio samples that match. This is not voice cloning. There’s no source audio. The voice is generated from scratch based on your text prompt.

The two-step flow: design → clone

The API has two separate resources:

Voice Design — an intermediate artifact. Think of it as a draft. You can iterate on it (up to 50 versions per design). It is NOT usable for TTS directly.
Voice Clone — a production-ready voice. Created from a design. This is what you pass to AI Assistants, Call Control, and the TTS API.

POST /v2/voice_designs → generates a sample → returns design id + version
POST /v2/voice_clones  → saves the design as a usable voice → returns voice clone id

The portal hides this two-step flow behind a single “Save This Voice” button. If you’re using the API directly, you need both steps.
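
A minimal sketch of the two calls chained together, using only the Python standard library. The two endpoint paths come from the flow above. Everything else is an assumption made for illustration: the placeholder host in BASE_URL, Bearer-token authentication, the request field names (prompt, voice_design_id, version), and a flat JSON response that exposes id and version at the top level. Check the API reference for the real request and response schemas before relying on any of these.

```python
import json
import urllib.request
from typing import Callable

# Placeholder host (assumption): substitute the real API base URL.
BASE_URL = "https://api.example.com"


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build a JSON POST request. Bearer auth is an assumption."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request over the network and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def design_then_clone(
    prompt: str,
    api_key: str,
    transport: Callable[[urllib.request.Request], dict] = send,
) -> str:
    """Run both steps and return the voice clone id."""
    # Step 1: create the design (the draft). The "prompt" field name is hypothetical.
    design = transport(
        build_request("/v2/voice_designs", {"prompt": prompt}, api_key)
    )

    # Step 2: save that design version as a usable voice.
    # The "voice_design_id" and "version" field names are hypothetical.
    clone = transport(
        build_request(
            "/v2/voice_clones",
            {"voice_design_id": design["id"], "version": design["version"]},
            api_key,
        )
    )
    return clone["id"]
```

The transport parameter exists only so the flow can be exercised without network access: pass a function that returns canned responses to check the call order, then drop the argument to use the real sender. The returned voice clone id, not the design id, is the value to pass on to TTS.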
