Voice format: xAI.Grok.<voice>
Five expressive voices with support for 20+ languages and automatic language detection.
Higher latency: Grok voices have higher latency than Ultra. For latency-sensitive applications that require sub-100ms TTFB, use Ultra.
Grok is the second Expressive Mode provider alongside Ultra, supporting inline speech tags for pauses, laughter, whispers, and emphasis.

Voices

Voice           Description
xAI.Grok.Ara    Ara: distinct character and tone
xAI.Grok.Eve    Eve: distinct character and tone
xAI.Grok.Leo    Leo: distinct character and tone
xAI.Grok.Rex    Rex: distinct character and tone
xAI.Grok.Sal    Sal: distinct character and tone
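
If you build voice identifiers in your own tooling, the xAI.Grok.<voice> format and the five names listed above can be checked up front. The following is a minimal sketch only: the helper name grok_voice_id and the constant GROK_VOICES are hypothetical and are not part of any Telnyx SDK.

```python
# Hypothetical helper: builds a voice identifier in the documented
# xAI.Grok.<voice> format and rejects names that are not listed above.
GROK_VOICES = ("Ara", "Eve", "Leo", "Rex", "Sal")


def grok_voice_id(name: str) -> str:
    """Return the full voice identifier for a short voice name."""
    candidate = name.strip().capitalize()
    if candidate not in GROK_VOICES:
        raise ValueError(
            f"Unknown Grok voice {name!r}; expected one of {', '.join(GROK_VOICES)}"
        )
    return f"xAI.Grok.{candidate}"


if __name__ == "__main__":
    print(grok_voice_id("ara"))
```

Validating the name locally catches a typo before it reaches your assistant configuration.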

Expressive Mode

Grok voices support Expressive Mode, letting AI agents dynamically adjust tone and emotion during live conversations. The AI model controls emotional delivery using inline speech tags for pauses, laughter, whispers, and emphasis — without hard-coding emotions into prompts. To enable Expressive Mode, toggle it on in your assistant’s Voice settings.

Language Support

Grok voices support 20+ languages. The spoken language is detected automatically, and BCP-47 language codes are also supported for consistent output.
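
BCP-47 codes follow a conventional casing: lowercase language, title-case script, uppercase region (for example en-US or zh-Hant-TW). The sketch below normalizes a simplified subset of such tags before you store or send them. It is an illustration under stated assumptions: normalize_language_tag is a hypothetical helper, it handles only language, script, and region subtags, and it does not check whether Grok voices support a given language.

```python
# Hypothetical helper: normalizes the casing of simple BCP-47 tags.
# Covers language[-script][-region] only, not the full BCP-47 grammar.
def normalize_language_tag(tag: str) -> str:
    parts = tag.strip().replace("_", "-").split("-")
    language = parts[0]
    if not (language.isalpha() and 2 <= len(language) <= 3):
        raise ValueError(f"Invalid language subtag in {tag!r}")

    normalized = [language.lower()]
    for sub in parts[1:]:
        if len(sub) == 4 and sub.isalpha():
            normalized.append(sub.title())   # script, e.g. Hant
        elif len(sub) == 2 and sub.isalpha():
            normalized.append(sub.upper())   # region, e.g. US
        elif len(sub) == 3 and sub.isdigit():
            normalized.append(sub)           # UN M.49 region, e.g. 419
        else:
            raise ValueError(f"Unsupported subtag {sub!r} in {tag!r}")
    return "-".join(normalized)


if __name__ == "__main__":
    print(normalize_language_tag("en_us"))
```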

On-Network Processing

Grok voices run through Telnyx-hosted models, keeping audio and inference on the same private backbone.

Getting Started

  1. Go to Mission Control → AI → Assistants → select your assistant → Voice tab.
  2. Select an xAI Grok voice (Ara, Eve, Leo, Rex, or Sal).
  3. Toggle Expressive Mode on.
  4. Save your assistant.