This page answers common customer questions about where data is processed and stored, which providers are used, retention, and model training. Telnyx AI spans two products that handle processing location differently:
  • Inference API — chat completions, the responses endpoint, and related model APIs.
  • Voice AI Assistants — telephony-based conversational agents.
The single most important thing to understand:
Telnyx offers hard controls for data at rest (storage location and retention), but does not offer hard controls for processing location. Where a request is processed is latency-based and best-effort — it is influenced, not guaranteed. Under failover or capacity events, processing shifts to the next-best region rather than failing.
This FAQ describes how the product works technically. It is not a legal commitment. Contractual terms — including data processing terms, training opt-outs, and any region commitments — are handled through your account team, a Data Processing Agreement (DPA), and applicable Telnyx terms. For written confirmations, contact support or your Telnyx account manager.

Processing vs. storage: the key distinction

Product | Processing in transit (best-effort, not guaranteed) | Storage at rest (hard control)
Inference API | Latency-based, influenced by the ingress domain you call (api.telnyx.com, api.telnyx.eu, api.telnyx.com.au). Not tied to data locality. Not guaranteed. | Chat completions: not stored. Responses endpoint (stores conversations): governed by your data locality setting.
Voice AI Assistants | Influenced by the anchorsite on the assistant’s TeXML application. Best-effort, not guaranteed. | Governed by your data locality flag, plus the data-retention setting for conversation content.
Data Locality governs storage at rest for covered data types. Neither the data locality flag nor the anchorsite is a hard guarantee of where live processing happens.

Inference API

Where is Inference processing performed?

Inference processing in transit is latency-based and best-effort, influenced by the ingress domain you call, not by your data locality setting:
Ingress domain | Preferred region
api.telnyx.com | US
api.telnyx.eu | EU
api.telnyx.com.au | APAC
Calling a regional ingress domain (for example, api.telnyx.eu) directs requests to the nearest GPU region for that domain on a best-effort basis. Telnyx does not guarantee the processing location: during failover or capacity events, requests are processed at the next-lowest-latency region rather than failing. See Inference Regions & Availability for the underlying GPU regions.
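The domain-to-region mapping above can be captured in a small client-side helper. This is a minimal sketch: the name `ingress_base_url`, the region keys, and the use of an `https://` base URL are illustrative assumptions, not part of any Telnyx SDK. Selecting a domain only expresses a routing preference.

```python
# Preferred-region lookup built from the ingress domains listed above.
# The dictionary keys and the function name are hypothetical.
INGRESS_DOMAINS = {
    "US": "api.telnyx.com",
    "EU": "api.telnyx.eu",
    "APAC": "api.telnyx.com.au",
}


def ingress_base_url(preferred_region: str) -> str:
    """Return the HTTPS base URL for the ingress domain of a preferred region.

    Choosing a domain influences routing on a best-effort basis; it does not
    guarantee where a request is processed.
    """
    try:
        domain = INGRESS_DOMAINS[preferred_region.upper()]
    except KeyError:
        raise ValueError(f"unknown region: {preferred_region!r}") from None
    return f"https://{domain}"
```

For example, `ingress_base_url("eu")` returns `https://api.telnyx.eu`. Keeping the choice in one place makes it easy to audit which ingress domain each environment calls.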

Does Inference store my data?

It depends on the endpoint:
  • Chat completions endpoint: does not store request or response data.
  • Responses endpoint: stores conversations. For stored data, your Data Locality setting dictates the storage region.

Can Inference traffic be pinned to a specific region?

Not as a hard guarantee. Routing is latency-based and best-effort: calling a regional ingress domain (for example, api.telnyx.eu) directs requests to that region under normal conditions, but Telnyx does not guarantee processing location. During failover or capacity events, requests are processed at the next-lowest-latency region rather than failing. If you have a strict compliance requirement for guaranteed processing location, contact support to discuss what is possible for your account.

Voice AI Assistants

Where is Voice AI Assistant processing performed?

For Voice AI Assistants, processing location is influenced by the anchorsite configured on the assistant’s TeXML application — not by the data locality flag. Setting the anchorsite (for example, Frankfurt for the EU) directs media/processing to that region under normal conditions. The anchorsite is best-effort, not a hard control. Telnyx does not guarantee processing location: under failover or capacity events, processing can shift to another region rather than failing the call.

Where is Voice AI Assistant data stored?

Storage location at rest is a hard control, governed by your Data Locality flag. Retention of conversation content is further controlled by the data-retention setting (see Data retention). Recording storage can also be directed to your own storage destination, which Telnyx respects.

Are call audio, transcripts, prompts, responses, summaries, or recordings ever handled outside the configured region?

  • Processing location is influenced by the assistant’s anchorsite, but is best-effort and not guaranteed.
  • Storage at rest is a hard control, following your data locality flag. Recordings can be directed to a customer-controlled storage destination, which Telnyx respects.
Telnyx does not contractually guarantee blanket “EU-only processing.” Processing controls are best-effort only, “processing” is defined very broadly, and some components — for example, third-party STT/TTS providers, or operational/security/fraud handling — may involve activity outside a single region. The specifics depend on the providers and features you enable. Confirm written data commitments with your account team and DPA before making representations to your own customers.

Example: EU-focused Voice AI setup

A typical EU-oriented configuration combines:
  • Data locality: EU (Germany) — a hard control over storage at rest
  • Anchorsite on the TeXML app: an EU site (for example, Frankfurt) — best-effort influence over media/processing location
  • Voice API endpoint: api.telnyx.eu
  • SIP endpoint: sip.telnyx.eu
This keeps storage in the EU (a hard control via data locality) and steers processing toward the EU (best-effort via the anchorsite). STT/TTS provider choice also matters — some providers are self-hosted by Telnyx and some are third parties (see below).
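As a rough illustration, the four settings can be kept together in one record and annotated with the strength of each control. The class name, field names, and the `summarize` helper below are hypothetical and do not correspond to a Telnyx API schema. The anchorsite value is a placeholder for whatever site name your account uses.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EuVoiceSetup:
    """Hypothetical container for the four settings listed above."""

    data_locality: str = "EU (Germany)"
    anchorsite: str = "Frankfurt"  # placeholder; use the exact site name from your account
    voice_api_host: str = "api.telnyx.eu"
    sip_host: str = "sip.telnyx.eu"


# Strength of each control as described on this page. Settings not listed
# here are not characterized as hard or best-effort in the text above.
CONTROL_STRENGTH = {
    "data_locality": "hard control over storage at rest",
    "anchorsite": "best-effort influence over media/processing location",
}


def summarize(setup: EuVoiceSetup) -> list[str]:
    """Return one human-readable line per setting, annotated where known."""
    lines = []
    for name, value in asdict(setup).items():
        note = CONTROL_STRENGTH.get(name)
        lines.append(f"{name} = {value}" + (f" ({note})" if note else ""))
    return lines
```

Printing `summarize(EuVoiceSetup())` gives a short checklist that can be attached to a compliance review, with the hard and best-effort controls called out explicitly.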

STT, TTS, and LLM providers (Voice AI)

For Voice AI Assistants, the STT, TTS, and LLM providers in use depend on the models and voices you select. Some are self-hosted by Telnyx (run on Telnyx-operated infrastructure); others are third-party services that Telnyx integrates with. This distinction matters for compliance: self-hosted models keep that processing step within Telnyx infrastructure, whereas third-party models route that step to the vendor.
Hosting (self-hosted vs. third-party) is about which infrastructure performs the step, not a guarantee of region. Processing region is best-effort for all providers — see the processing vs. storage note above.

Speech-to-text (STT)

Model | Provider | Hosting
deepgram/flux | Deepgram | Self-hosted by Telnyx
deepgram/nova-3 | Deepgram | Self-hosted by Telnyx
deepgram/nova-2 | Deepgram | Self-hosted by Telnyx
assemblyai/universal-streaming | AssemblyAI | Self-hosted by Telnyx
speechmatics/standard | Speechmatics | Self-hosted by Telnyx
distil-whisper/distil-large-v2 | Whisper (English-only) | Self-hosted by Telnyx
azure/fast | Azure | Third-party
soniox/stt-rt-v4 | Soniox | Third-party
xai/grok-stt | xAI | Third-party

Text-to-speech (TTS)

TTS is delivered through Telnyx’s TTS gateway, which integrates multiple providers. The provider depends on the voice you select:
Provider | Hosting
Telnyx (in-house voices, including Telnyx Ultra) | Self-hosted by Telnyx
Rime | Self-hosted by Telnyx
Resemble | Self-hosted by Telnyx
ElevenLabs | Third-party
AWS | Third-party
Azure | Third-party
Minimax | Third-party
Inworld | Third-party
xAI | Third-party
See Text to Speech voices for the current voice catalog.

Large language model (LLM)

The assistant’s model is served through Telnyx’s inference platform, and the model in use is the one you configure on the assistant. Models self-hosted by Telnyx (open models served on Telnyx infrastructure) include the Qwen and Moonshot (Kimi) model families, for example Qwen/Qwen3-235B-A22B, moonshotai/Kimi-K2.5, and moonshotai/Kimi-K2.6. Third-party models, including those from Anthropic (Claude), OpenAI (GPT), and Google (Gemini), are not self-hosted: when you select one of these, the prompt is sent to that external provider to generate the response. The available models evolve over time; for the current catalog and which models are recommended for assistants, see Models.
If data residency or third-party data sharing is a concern, choose a self-hosted model (a Qwen or Moonshot/Kimi model) to keep prompt and response generation on Telnyx infrastructure. Region remains best-effort even for self-hosted models.

Can STT, TTS, or LLM processing be restricted to the EU?

There is no hard guarantee of processing region for any provider — processing is best-effort. In addition:
  • Self-hosted providers keep that processing step on Telnyx infrastructure, but region remains best-effort.
  • Third-party providers route that step to the vendor, whose own region behavior applies.
If you need STT, TTS, or LLM processing constrained to a specific region, contact support so we can advise which self-hosted provider/model combinations best fit your requirement. Hard region guarantees are not offered for processing.
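One way to audit a configuration before contacting support is to check each pipeline step against the hosting tables on this page. The sketch below is an assumption-laden illustration: the dictionaries are a hand-copied snapshot of the STT and TTS tables above and can go stale, and the function names are hypothetical. Hosting says which infrastructure performs a step, not which region.

```python
# Snapshot of the STT and TTS hosting tables on this page (may go stale).
STT_HOSTING = {
    "deepgram/flux": "self-hosted",
    "deepgram/nova-3": "self-hosted",
    "deepgram/nova-2": "self-hosted",
    "assemblyai/universal-streaming": "self-hosted",
    "speechmatics/standard": "self-hosted",
    "distil-whisper/distil-large-v2": "self-hosted",
    "azure/fast": "third-party",
    "soniox/stt-rt-v4": "third-party",
    "xai/grok-stt": "third-party",
}

TTS_HOSTING = {
    "Telnyx": "self-hosted",
    "Rime": "self-hosted",
    "Resemble": "self-hosted",
    "ElevenLabs": "third-party",
    "AWS": "third-party",
    "Azure": "third-party",
    "Minimax": "third-party",
    "Inworld": "third-party",
    "xAI": "third-party",
}


def hosting_report(stt_model: str, tts_provider: str) -> dict[str, str]:
    """Map each pipeline step to its hosting type, or 'unknown' if unlisted."""
    return {
        "stt": STT_HOSTING.get(stt_model, "unknown"),
        "tts": TTS_HOSTING.get(tts_provider, "unknown"),
    }


def needs_review(report: dict[str, str]) -> list[str]:
    """Return the steps that are not confirmed as self-hosted, sorted by name."""
    return sorted(step for step, hosting in report.items() if hosting != "self-hosted")
```

For instance, `needs_review(hosting_report("deepgram/nova-3", "ElevenLabs"))` returns `["tts"]`, flagging the TTS step as one routed to an external vendor. Unlisted entries are reported as `unknown` rather than assumed safe.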

Recordings

Are call recordings disabled by default?

No — for Voice AI Assistants, call recordings are enabled by default, and you can turn them off. When recordings are enabled, the recording is stored as Media Storage, which is subject to your Data Locality setting. Disable recording on the assistant (or per call) if you do not want recordings retained.

Data retention and model training

What does the data-retention setting control?

Voice AI Assistants expose a data-retention privacy setting (privacy_settings.data_retention), which is enabled by default. When you disable it, the assistant stops persisting conversation content while continuing the minimum processing needed to run and bill the call. With data_retention disabled, the following conversation content is not retained:
Item | Behavior when retention is off
Conversation messages / transcripts | Not persisted to the conversations store
Insights | Not retained. An insight may be computed transiently in-memory to support live conversation behavior, but the conversation and its insights are not stored
Transcript and assistant answer in observability logs | Not retained; replaced with placeholders (for example, [transcript not available] / [answer not available])
LLM request/response content logging | Disabled
TTS cache | Disabled, so synthesized audio is not cached
A limited set of records is still retained even when conversation retention is off, because they are required to operate and bill the service:
Item | Behavior when retention is off
Latency / timing metrics | Retained (timing only, no conversation content)
Billing, security, and fraud-prevention records | Retained as required for legitimate business and compliance purposes
The data-retention flag governs retention of conversation content for Voice AI Assistants. Disabling it stops persistence of conversation content and insights; it does not change where data that is retained lives — storage region is controlled by Data Locality. Recordings are governed separately by the recording setting (see Recordings above). For a guarantee tailored to your exact configuration (audio, tool inputs/outputs, memory, observability traces, and third-party provider logs), confirm in writing with your account team and DPA.
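As a sketch of what toggling the flag might look like in a settings payload, the helper below serializes the dotted path `privacy_settings.data_retention` as nested JSON. The nesting, the function name `retention_patch`, and the idea of sending this fragment in an assistant update are assumptions for illustration; check the API reference for the actual request shape.

```python
import json

# Records that, per the tables above, remain even when retention is off.
STILL_RETAINED = (
    "latency / timing metrics",
    "billing, security, and fraud-prevention records",
)


def retention_patch(enabled: bool) -> str:
    """Serialize a settings fragment that toggles conversation-content retention.

    The nested layout mirrors the dotted name privacy_settings.data_retention
    and is an assumed shape, not a confirmed request body.
    """
    return json.dumps({"privacy_settings": {"data_retention": enabled}})
```

Calling `retention_patch(False)` produces `{"privacy_settings": {"data_retention": false}}`. The flag does not affect recordings or storage region, which are controlled by the recording setting and Data Locality respectively.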

Can a customer opt out of model improvement / training / evaluation?

Customer data handling for model training is governed by Telnyx’s applicable terms and DPA. If you require an opt-out from model improvement, training, or evaluation — for both input and output data, and covering Telnyx and any third-party AI providers in your configuration — contact your account team to confirm the governing terms and document the opt-out.

Usage reporting and billing

Can usage be broken down by assistant, phone number, or metadata/tag?

Usage and conversation data can be attributed using identifiers such as the assistant, the associated phone number, and metadata. For subscriber-level or per-tag billing breakdowns, contact support to confirm which dimensions are available and how to structure metadata/tags for clean attribution. See Agent Observability and Session Analysis.