# Inference API Quickstart

> Quickstart for the Telnyx Inference API. Get an API key, send your first chat completion request, and explore models for text, embeddings, and audio.

## Prerequisites

* [Telnyx account](https://telnyx.com/sign-up)
* [API Key](https://portal.telnyx.com/#/app/auth/v2)
* Python 3.8+

Install the OpenAI SDK:

```shell
pip install openai
```

The Inference API is OpenAI-compatible, so any OpenAI SDK works once you point its `base_url` at the Telnyx endpoint.

## Python

```python
import os
from openai import OpenAI

client = OpenAI(
  api_key=os.getenv("TELNYX_API_KEY"),
  base_url="https://api.telnyx.com/v2/ai/openai",
)

chat_completion = client.chat.completions.create(
  messages=[
    {"role": "user", "content": "Tell me about Telnyx"}
  ],
  model="zai-org/GLM-5.1-FP8",
  stream=True
)

# GLM-5.1 is a reasoning model: it streams its thinking in `reasoning_content`
# before the final answer in `content`. Print both so you can see the reasoning.
reasoning_started = False
content_started = False
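for chunk in chat_completion:
  # Some chunks (for example a trailing usage chunk) may carry no choices.
  if not chunk.choices:
    continue
  delta = chunk.choices[0].delta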
  if getattr(delta, "reasoning_content", None):
    if not reasoning_started:
      print("--- reasoning ---")
      reasoning_started = True
    print(delta.reasoning_content, end="", flush=True)
  if delta.content:
    if not content_started:
      print("\n--- answer ---")
      content_started = True
    print(delta.content, end="", flush=True)
```

<Note>
  Reasoning models such as `zai-org/GLM-5.1-FP8` return their chain-of-thought in a
  separate `reasoning_content` field (on `message` for non-streaming responses, or
  `delta` when streaming). Models without reasoning simply omit it, so the
  `getattr(..., "reasoning_content", None)` guard works for every model.
</Note>
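
For a non-streaming request, the same field would be read from `message` instead of `delta`. The following is a minimal sketch, assuming the response follows the OpenAI chat-completion shape. `split_reasoning` is a hypothetical helper, not part of any SDK, and the stub object stands in for `response.choices[0].message` so the snippet runs without a network call:

```python
from types import SimpleNamespace


def split_reasoning(message):
  """Return (reasoning, answer) from a chat-completion message.

  `reasoning` is None when the model does not emit `reasoning_content`.
  """
  reasoning = getattr(message, "reasoning_content", None)
  answer = message.content or ""
  return reasoning, answer


# Stub standing in for `response.choices[0].message` from a request made
# without `stream=True`. The field values are placeholders for illustration.
stub_message = SimpleNamespace(
  content="Example answer text.",
  reasoning_content="Example reasoning text.",
)

reasoning, answer = split_reasoning(stub_message)
if reasoning:
  print("--- reasoning ---")
  print(reasoning)
print("--- answer ---")
print(answer)
```

With a real response, you would pass `response.choices[0].message` to the helper in place of the stub.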

## Core Concepts

### Messages

The chat history passed to the model, as an ordered list. Each message is an object with a `role` and `content`.

### Roles

Every message has a role: **system**, **user**, **assistant**, or **tool**.

* **system** — model behavior instructions
* **user** — end-user input
* **assistant** — model output
* **tool** — function call results. See [Function Calling](/docs/inference/functions).
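
To illustrate how the roles combine, here is a minimal sketch that assembles a `messages` list from a system prompt, earlier turns, and a new user input. `build_messages` is a hypothetical helper written for this example, and the prompt text is made up. Messages with the **tool** role are left out, since they depend on the function-calling flow:

```python
def build_messages(system_prompt, history, user_input):
  """Assemble the `messages` list for a chat completion request.

  `history` is a list of (user_text, assistant_text) pairs from earlier turns.
  """
  messages = [{"role": "system", "content": system_prompt}]
  for user_text, assistant_text in history:
    messages.append({"role": "user", "content": user_text})
    messages.append({"role": "assistant", "content": assistant_text})
  # The newest user input always goes last.
  messages.append({"role": "user", "content": user_input})
  return messages


messages = build_messages(
  system_prompt="You are a concise assistant.",
  history=[("What is an API?", "An interface that programs use to talk to each other.")],
  user_input="Give me an example.",
)
for message in messages:
  print(f"{message['role']}: {message['content']}")
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create`.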

### Models

[Available Models](/docs/inference/models) lists all hosted LLMs with context lengths and capabilities.

### Streaming

Set `stream=True` to receive the response incrementally as server-sent events, in the same format OpenAI uses.
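
Each streamed event carries only a fragment of the output, so an application that needs the full text has to join the fragments itself. The sketch below shows one way to do that, assuming chunks follow the OpenAI streaming shape (`chunk.choices[0].delta`). `collect_stream` and `fake_chunk` are hypothetical helpers, and the fake stream replaces a live response so the snippet runs offline:

```python
from types import SimpleNamespace


def collect_stream(chunks):
  """Concatenate streamed deltas into (reasoning, answer) strings."""
  reasoning_parts = []
  answer_parts = []
  for chunk in chunks:
    # Skip chunks that carry no choices.
    if not chunk.choices:
      continue
    delta = chunk.choices[0].delta
    reasoning_piece = getattr(delta, "reasoning_content", None)
    if reasoning_piece:
      reasoning_parts.append(reasoning_piece)
    if getattr(delta, "content", None):
      answer_parts.append(delta.content)
  return "".join(reasoning_parts), "".join(answer_parts)


def fake_chunk(**delta_fields):
  """Build a stub with the same attribute layout as a streamed chunk."""
  delta = SimpleNamespace(**delta_fields)
  return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Made-up fragments standing in for the iterator returned with stream=True.
fake_stream = [
  fake_chunk(reasoning_content="Thinking", content=None),
  fake_chunk(reasoning_content=" it over.", content=None),
  SimpleNamespace(choices=[]),
  fake_chunk(content="Hello"),
  fake_chunk(content=", world."),
]

reasoning, answer = collect_stream(fake_stream)
print(reasoning)
print(answer)
```

With a live request, the object returned by `client.chat.completions.create(..., stream=True)` would take the place of `fake_stream`.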

## What's Next?

| I want to...                    | Go to                                                                                                      |
| :------------------------------ | :--------------------------------------------------------------------------------------------------------- |
| Build a voice assistant         | [No-Code Voice Assistant](/docs/inference/ai-assistants/no-code-voice-assistant)                           |
| Call custom code from the model | [Function Calling](/docs/inference/functions) / [Streaming Functions](/docs/inference/streaming-functions) |
| Ground responses in documents   | [Embeddings](/docs/inference/embeddings)                                                                   |
| Identify themes in data         | [Clusters](/docs/inference/clusters)                                                                       |
| Migrate from OpenAI             | [OpenAI Migration](/docs/inference/openai)                                                                 |
| Browse all models               | [Available Models](/docs/inference/models)                                                                 |
