Start

Overview

The <Start> verb begins media operations during a call, such as a recording session or a transcription service for the call leg. Place it inside a <Dial> block to start these operations on the connected call, or use it on its own to start live services before a connection is made.

You can use multiple <Start> verbs to begin different operations in parallel.

Syntax

<Start media="recording|transcription"
       direction="inbound|outbound|both"
       format="wav|mp3"
       channels="1|2"
       sample_rate="8000|16000"
       transcript_url="url" />

Attributes

Attribute      | Type    | Required | Default | Description
media          | string  | Yes      |         | Type of media to start: recording or transcription.
direction      | string  | No       | both    | Audio direction: inbound, outbound, or both.
format         | string  | No       | wav     | Recording format, for example wav or mp3.
channels       | integer | No       | 1       | Number of audio channels: 1 (mono) or 2 (stereo).
sample_rate    | integer | No       | 8000    | Audio sample rate in Hz (8000 or 16000).
transcript_url | URI     | No       |         | URL to receive transcription webhooks. Only used when media="transcription".
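
The attribute rules can be checked before the markup is sent. The following is a minimal sketch in Python, not part of any official SDK: the helper name build_start is hypothetical, and the allowed values are copied from the Syntax block, so a real platform may accept values (for example additional recording formats) that this check rejects.

```python
import xml.etree.ElementTree as ET

# Allowed values copied from the Syntax block; this table is an assumption,
# not an authoritative list from the platform.
ALLOWED = {
    "media": {"recording", "transcription"},
    "direction": {"inbound", "outbound", "both"},
    "format": {"wav", "mp3"},
    "channels": {"1", "2"},
    "sample_rate": {"8000", "16000"},
}


def build_start(media, **attrs):
    """Return a <Start> element, raising ValueError on undocumented values."""
    values = {"media": media}
    values.update({name: str(value) for name, value in attrs.items()})
    for name, value in values.items():
        if name == "transcript_url":
            continue  # free-form URL; not validated in this sketch
        if name not in ALLOWED:
            raise ValueError(f"unknown attribute: {name}")
        if value not in ALLOWED[name]:
            raise ValueError(f"{name}={value!r} is not one of {sorted(ALLOWED[name])}")
    return ET.Element("Start", values)


element = build_start("recording", format="mp3", channels=2, sample_rate=16000)
print(ET.tostring(element, encoding="unicode"))
```

Raising an error locally is a deliberate choice here: it surfaces mistakes at build time instead of leaving them to be discovered during a live call.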

Example

<Response>
  <Dial caller="18005551234">
    <Connect to="+15556667777" />
    <Start media="recording" format="mp3" channels="2" sample_rate="16000" />
    <Start media="transcription" transcript_url="https://example.com/api/transcripts" />
  </Dial>
</Response>

Behavior

  • When media="recording", audio is captured according to the provided parameters (format, direction, etc.).
  • When media="transcription", the call's audio is streamed to a speech-to-text engine and results are POSTed to the transcript_url.
  • A <Start> verb should follow a successful <Connect> if targeting the bridged call.
  • Multiple <Start> verbs are supported within a single <Dial>.

Webhook payloads

Recording metadata includes:

{
  "recording_url": "https://.../recording.mp3",
  "duration": 84,
  "format": "mp3",
  "start_time": "2025-07-17T15:00:00Z",
  "end_time": "2025-07-17T15:01:24Z"
}
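
A receiver might parse this payload and sanity-check it as sketched below. The function name parse_recording_metadata is hypothetical, and the check assumes that duration is expressed in seconds. The source does not state the unit, but the sample is consistent with it (84 seconds between 15:00:00 and 15:01:24).

```python
import json
from datetime import datetime


def parse_recording_metadata(raw):
    """Parse the payload and compare duration with end_time - start_time."""
    data = json.loads(raw)
    # Swap the trailing "Z" for an explicit offset so fromisoformat also
    # accepts the timestamp on Python versions older than 3.11.
    start = datetime.fromisoformat(data["start_time"].replace("Z", "+00:00"))
    end = datetime.fromisoformat(data["end_time"].replace("Z", "+00:00"))
    elapsed = int((end - start).total_seconds())
    # "elapsed" and "consistent" are fields added by this sketch, not by the platform.
    return {**data, "elapsed": elapsed, "consistent": elapsed == data["duration"]}


sample = """{
  "recording_url": "https://.../recording.mp3",
  "duration": 84,
  "format": "mp3",
  "start_time": "2025-07-17T15:00:00Z",
  "end_time": "2025-07-17T15:01:24Z"
}"""

result = parse_recording_metadata(sample)
print(result["elapsed"], result["consistent"])
```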

Transcription events:

  • transcription_started
  • transcription_partial
  • transcription_completed
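
The three event types suggest a simple dispatch on the receiving side. The sketch below is illustrative only: the payload shape is not documented here, so the "event" and "text" field names are assumptions, as is the idea that each partial event replaces the previous interim text.

```python
import json


def handle_transcription_event(body, transcript):
    """Apply one webhook body to a running transcript state dict."""
    payload = json.loads(body)
    event = payload.get("event")    # hypothetical field name
    text = payload.get("text", "")  # hypothetical field name
    if event == "transcription_started":
        transcript.update(status="in_progress", partial="", final="")
    elif event == "transcription_partial":
        # Assumed to be interim text that later events may revise.
        transcript["partial"] = text
    elif event == "transcription_completed":
        transcript.update(status="done", partial="", final=text)
    else:
        raise ValueError(f"unexpected event: {event!r}")
    return transcript


state = {}
handle_transcription_event('{"event": "transcription_started"}', state)
handle_transcription_event('{"event": "transcription_partial", "text": "hello"}', state)
handle_transcription_event('{"event": "transcription_completed", "text": "hello world"}', state)
print(state)
```

Keeping the state update separate from any HTTP framework makes the handler easy to test and to plug into whichever server receives the POST requests.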

Error handling

  • Invalid media types are ignored.
  • If transcript_url is missing for transcription, events are not delivered.
  • Errors are logged but not surfaced to the call, so the call flow is not interrupted.
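
Because these failures do not interrupt the call, they are easy to miss. One option is to lint the markup before serving it. The sketch below uses Python's standard xml.etree.ElementTree; the function name lint_start_verbs is hypothetical and it checks only the first two conditions listed above.

```python
import xml.etree.ElementTree as ET


def lint_start_verbs(document):
    """Return warnings for <Start> verbs that would fail without a visible error."""
    warnings = []
    for start in ET.fromstring(document).iter("Start"):
        media = start.get("media")
        if media not in ("recording", "transcription"):
            warnings.append(f"media={media!r} is invalid; the verb will be ignored")
        elif media == "transcription" and not start.get("transcript_url"):
            warnings.append("transcription without transcript_url; no events will be delivered")
    return warnings


doc = """<Response>
  <Dial caller="18005551234">
    <Connect to="+15556667777" />
    <Start media="video" />
    <Start media="transcription" />
  </Dial>
</Response>"""

for warning in lint_start_verbs(doc):
    print(warning)
```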

Notes

  • Place <Start> after a <Connect> verb to capture bridged audio.
  • Operations can be layered: for example, you can record and transcribe the same call.
  • Do not rely on transcription accuracy for compliance use cases.