Stream speech to text over WebSocket
AsyncAPI specification for the Telnyx Speech-to-Text WebSocket endpoint. The endpoint provides real-time speech transcription: the client streams audio and receives transcript frames.
Supported Engines
- Azure: Microsoft Azure Speech Services
- Deepgram: Deepgram Nova models
- Google: Google Cloud Speech-to-Text
- Telnyx: Telnyx native transcription (OpenAI Whisper models)
- xAI: xAI Grok STT
- AssemblyAI: AssemblyAI Universal-Streaming
- Speechmatics: Speechmatics real-time transcription
- Soniox: Soniox real-time transcription
Connection Flow
- Open WebSocket connection to wss://api.telnyx.com/v2/speech-to-text/transcription with query parameters.
- Send binary audio frames (mp3 or wav format).
- Receive JSON transcript frames with transcript, is_final, and confidence fields.
- Close connection when done.
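The flow above can be sketched in Python roughly as follows. This is a minimal illustration, not the official client. The helper names (build_url, parse_frame, transcribe) are hypothetical, and the query parameter names are left to the caller because this section does not list them. The ws argument is assumed to be an already connected WebSocket client object with an async send() method and async iteration over incoming messages; adapt it to whichever library you use. How the server signals the end of a stream is not stated here, so the sketch simply reads until the connection closes.

```python
import json
from urllib.parse import urlencode

BASE_URL = "wss://api.telnyx.com/v2/speech-to-text/transcription"


def build_url(params):
    """Append caller-supplied query parameters to the endpoint URL.

    Parameter names are not defined in this section, so none are hard-coded.
    """
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL


def parse_frame(raw):
    """Decode one JSON transcript frame into (transcript, is_final, confidence)."""
    frame = json.loads(raw)
    return frame.get("transcript"), frame.get("is_final"), frame.get("confidence")


async def transcribe(ws, chunks):
    """Send binary audio chunks, then collect the final transcripts.

    Sending and receiving run one after the other for brevity; a real
    client would probably do both concurrently.
    """
    for chunk in chunks:
        await ws.send(chunk)  # binary audio frame (mp3 or wav bytes)

    finals = []
    async for message in ws:  # JSON transcript frames from the server
        transcript, is_final, _confidence = parse_frame(message)
        if is_final:
            finals.append(transcript)
    return finals
```

Keeping the URL building and frame parsing separate from the connection handling means both can be checked without a live connection, and the same parsing works with any WebSocket client.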
Authentication
Requires authentication via a Bearer token (Telnyx API v2 key).
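As a small sketch of what this could look like in practice, the snippet below builds the header for the WebSocket handshake. It assumes the Bearer token is sent in a standard HTTP Authorization header, which this section does not state explicitly. The auth_headers helper and the TELNYX_API_KEY environment variable name are hypothetical.

```python
import os


def auth_headers(api_key):
    """Build the handshake headers carrying the Bearer token.

    Assumes the standard "Authorization: Bearer <key>" form.
    """
    if not api_key:
        raise ValueError("a Telnyx API v2 key is required")
    return {"Authorization": f"Bearer {api_key}"}


# TELNYX_API_KEY is a hypothetical variable name; the fallback is a placeholder.
headers = auth_headers(os.environ.get("TELNYX_API_KEY") or "YOUR_API_KEY")
```

Pass the resulting dictionary to your WebSocket client's option for extra handshake headers; the name of that option varies between libraries and versions.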
Messages