
Create a chat completion


Chat with a language model. This endpoint is consistent with the OpenAI Chat Completions API and may be used with the OpenAI JS or Python SDK.







    A list of the previous chat messages for context.

  • Array [

  • content




    role string required

    Possible values: [system, user, assistant, tool]

  • ]


    model string

    Default value: meta-llama/Meta-Llama-3.1-8B-Instruct

    The language model to chat with. If you are optimizing for speed and price, try meta-llama/Meta-Llama-3.1-8B-Instruct. For quality, try meta-llama/Meta-Llama-3.1-70B-Instruct. You can also explore our LLM Library.
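As a minimal sketch of how the message list and the model field fit together, the request body can be assembled as a plain dictionary. The helper name build_chat_payload is hypothetical and not part of any SDK; the role values and the default model come from the field descriptions above.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant", "tool"}
DEFAULT_MODEL = "meta-llama/Meta-Llama-3.1-8B-Instruct"


def build_chat_payload(messages, model=DEFAULT_MODEL):
    """Return a request body dict (hypothetical helper, not part of any SDK)."""
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
    return {"model": model, "messages": list(messages)}


payload = build_chat_payload(
    [
        {"role": "system", "content": "You are a friendly chatbot."},
        {"role": "user", "content": "Hello, world!"},
    ]
)
body = json.dumps(payload)  # JSON string to send as the POST body
```

Checking roles on the client side is optional; it only surfaces a mistake before the request is sent.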

    api_key_ref string

    If you are using an external inference provider like xAI or OpenAI, this field allows you to pass along a reference to your API key. After creating an integration secret for your API key, pass the secret's identifier in this field.

    stream boolean

    Whether or not to stream data-only server-sent events as they become available.
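When stream is true, the client has to read data-only server-sent events line by line. The sketch below parses such lines with the standard library only. It assumes the OpenAI-style conventions of a "[DONE]" end marker and a choices[0].delta.content chunk layout; neither is confirmed by this page, and the sample stream is invented.

```python
import json


def iter_sse_data(lines):
    """Yield decoded JSON payloads from data-only server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed OpenAI-style end-of-stream sentinel
            break
        yield json.loads(data)


# Hypothetical stream, shaped like OpenAI-style completion chunks.
sample_stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in iter_sse_data(sample_stream)
)
```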

    temperature number

    Default value: 0.1

    Adjusts the "creativity" of the model. Lower values make the model more deterministic and repetitive, while higher values make the model more random and creative.
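To illustrate why lower temperatures make output more deterministic, the sketch below applies the standard temperature-scaled softmax to a few made-up logits. It shows the general technique only; how the served models apply temperature internally is not described on this page.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.1)  # nearly all mass on the top token
hot = softmax_with_temperature(logits, 2.0)   # mass spread more evenly
```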

    max_tokens integer

    Maximum number of completion tokens the model should generate.



    The function tool type follows the same schema as the OpenAI Chat Completions API. The retrieval tool type is unique to Telnyx. You may pass a list of embedded storage buckets for retrieval-augmented generation.

  • Array [

  • oneOf

  • ]

  • tool_choice string

    Possible values: [none, auto, required]
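A function tool can be sketched as follows. The layout follows the OpenAI function tool schema, which the description above says this endpoint shares. The request field name tools is assumed from the OpenAI API, and get_weather is a hypothetical function used only for illustration.

```python
import json

# Hypothetical function tool; the layout follows the OpenAI function tool schema.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = {
    "messages": [{"role": "user", "content": "What is the weather in Chicago?"}],
    "tools": [weather_tool],  # field name assumed from the OpenAI API
    "tool_choice": "auto",    # one of: none, auto, required
}
body = json.dumps(payload)
```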



    Use this if you want to guarantee JSON output without defining a schema. For control over the schema, use guided_json.

    type string required

    Possible values: [text, json_object]

    guided_json object

    Must be a valid JSON schema. If specified, the output will follow the JSON schema.

    guided_regex string

    If specified, the output will follow the regex pattern.

    guided_choice string[]

    If specified, the output will be exactly one of the choices.
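The guided fields constrain what the model may return, so a client can check a response against the same constraint. The sketch below builds a request with guided_choice and adds two hypothetical local checks. Treating guided_regex as a match against the whole output is an assumption.

```python
import re

choices = ["positive", "negative", "neutral"]
payload = {
    "messages": [{"role": "user", "content": "Classify: 'I love this.'"}],
    "guided_choice": choices,
}


def matches_guided_choice(output, allowed):
    """True when the output is exactly one of the allowed strings."""
    return output in allowed


def matches_guided_regex(output, pattern):
    """True when the whole output matches the regex pattern (assumed semantics)."""
    return re.fullmatch(pattern, output) is not None
```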

    min_p number

    This is an alternative to top_p that many prefer. Must be in [0, 1].
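For intuition, min_p sampling is commonly defined as keeping only tokens whose probability is at least min_p times the probability of the most likely token. The sketch below implements that common definition on invented probabilities; whether this endpoint uses exactly this rule is an assumption.

```python
def min_p_filter(probs, min_p):
    """Keep tokens whose probability is at least min_p times the top probability."""
    if not 0.0 <= min_p <= 1.0:
        raise ValueError("min_p must be in [0, 1]")
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}  # renormalize


probs = {"the": 0.5, "a": 0.3, "zebra": 0.04, "qux": 0.16}  # made-up values
filtered = min_p_filter(probs, 0.2)  # threshold is 0.2 * 0.5 = 0.1
```

Because the threshold scales with the top probability, the filter is strict when the model is confident and permissive when it is not.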

    n number

    This will return multiple choices for you instead of a single chat completion.

    use_beam_search boolean

    Setting this to true will allow the model to explore more completion options. This is not supported by OpenAI.

    best_of integer

    This is used with use_beam_search to determine how many candidate beams to explore.

    length_penalty number

    Default value: 1

    This is used with use_beam_search to prefer shorter or longer completions.

    early_stopping boolean

    This is used with use_beam_search. If true, generation stops as soon as there are best_of complete candidates; if false, a heuristic is applied and generation stops when it is very unlikely to find better candidates.
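The effect of length_penalty can be sketched with a scoring rule that many beam search implementations use: the cumulative log probability divided by the length raised to the penalty. This formula is an assumption about how candidates are ranked here, and the two candidates are made up.

```python
def beam_score(cumulative_logprob, length, length_penalty=1.0):
    """Length-normalized score: sum of log probabilities / length ** penalty."""
    return cumulative_logprob / (length ** length_penalty)


# Two made-up finished candidates: (cumulative log probability, token count).
short = (-4.0, 4)
long = (-6.0, 8)

# With no length normalization the short candidate ranks higher.
short_wins = beam_score(*short, length_penalty=0.0) > beam_score(*long, length_penalty=0.0)

# With the default penalty of 1 the longer candidate ranks higher.
long_wins = beam_score(*long, length_penalty=1.0) > beam_score(*short, length_penalty=1.0)
```

Under this rule, larger penalties favor longer completions and smaller ones favor shorter completions.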

    logprobs boolean

    Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.

    top_logprobs integer

    This is used with logprobs. An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability.
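A log probability converts to a plain probability with the exponential function. The sketch below reads a hypothetical response fragment; the choices[].logprobs.content layout is assumed from the OpenAI Chat Completions API, and the numbers are invented.

```python
import math

# Hypothetical response fragment in the OpenAI logprobs layout.
choice = {
    "logprobs": {
        "content": [
            {
                "token": "Hello",
                "logprob": -0.105,
                "top_logprobs": [
                    {"token": "Hello", "logprob": -0.105},
                    {"token": "Hi", "logprob": -2.303},
                ],
            },
        ]
    }
}

for entry in choice["logprobs"]["content"]:
    probability = math.exp(entry["logprob"])  # log probability -> probability
    alternatives = {
        alt["token"]: math.exp(alt["logprob"]) for alt in entry["top_logprobs"]
    }
```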

    frequency_penalty number

    Default value: 0

    Higher values penalize the model for repeating the same output tokens.

    presence_penalty number

    Default value: 0

    Higher values penalize the model for repeating the same output tokens.
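The two penalties share a description here, but in the OpenAI API they differ: the frequency penalty grows with each repetition of a token, while the presence penalty is a flat deduction once a token has appeared at all. The sketch below follows the formula documented by OpenAI. That this endpoint applies the same formula is an assumption, and the logits are made up.

```python
from collections import Counter


def apply_penalties(logits, generated, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the logits of tokens that already appear in the generated text."""
    counts = Counter(generated)
    adjusted = {}
    for token, logit in logits.items():
        count = counts.get(token, 0)
        seen = 1.0 if count > 0 else 0.0
        # Frequency term scales with the count; presence term is flat once seen.
        adjusted[token] = logit - count * frequency_penalty - seen * presence_penalty
    return adjusted


logits = {"cat": 1.0, "dog": 1.0, "fish": 1.0}
generated = ["cat", "cat", "cat", "dog"]
adjusted = apply_penalties(
    logits, generated, frequency_penalty=0.5, presence_penalty=0.2
)
```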

    top_p number

    An alternative or complement to temperature. This adjusts how many of the top possibilities to consider.
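As an illustration of "how many of the top possibilities to consider", the sketch below implements standard nucleus (top_p) filtering on invented probabilities: tokens are taken in order of likelihood until their combined probability reaches top_p. It shows the general technique, not this service's internals.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose total probability reaches top_p."""
    kept = {}
    cumulative = 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}  # renormalize


probs = {"the": 0.5, "a": 0.3, "qux": 0.15, "zebra": 0.05}  # made-up values
nucleus = top_p_filter(probs, 0.75)  # "the" and "a" together reach 0.8
```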


200: Successful Response

422: Validation Error
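A 422 response carries a detail list whose entries have loc, msg, and type fields, as in the response sample on this page. The sketch below turns such a body into readable messages. The helper name is hypothetical, and the loc, msg, and type values in the example body are invented for illustration.

```python
def summarize_validation_error(body):
    """Turn a 422 body into readable 'location: message' strings."""
    lines = []
    for item in body.get("detail", []):
        location = ".".join(str(part) for part in item.get("loc", []))
        lines.append(f"{location}: {item.get('msg', '')}")
    return lines


# Hypothetical 422 body; the field values are invented for illustration.
error_body = {
    "detail": [
        {
            "loc": ["body", "messages", 0, "role"],
            "msg": "field required",
            "type": "value_error.missing",
        }
    ]
}
summary = summarize_validation_error(error_body)
```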

Request samples

curl -L '' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer <TOKEN>' \
-d '{
  "messages": [
    {
      "role": "system",
      "content": "You are a friendly chatbot."
    },
    {
      "role": "user",
      "content": "Hello, world!"
    }
  ]
}'

Response samples

"detail": [
"loc": [
"msg": "string",
"type": "string"