Skip to main content

Create a chat completion

POST 
/ai/chat/completions

Chat with a language model. This endpoint is consistent with the OpenAI Chat Completions API and may be used with the OpenAI JS or Python SDK.

Request

Body

required

    messages

    object[]

    required

    A list of the previous chat messages for context.

  • Array [

  • content stringrequired
    role stringrequired

    Possible values: [system, user, assistant, tool]

  • ]

  • A list of the previous chat messages for context.

    model string

    Default value: NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO

    The language model to chat with. If you are optimizing for speed, try mistralai/Mistral-7B-Instruct-v0.1. For quality, try NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO

    stream boolean

    Whether or not to stream data-only server-sent events as they become available.

    max_tokens integer

    Maximum number of completion tokens the model should generate.

    temperature number

    Adjusts the "creativity" of the model. Lower values make the model more deterministic and repetitive, while higher values make the model more random and creative.

    guided_json object

    Must be a valid JSON schema. If specified, the output will follow the JSON schema.

    guided_regex string

    If specified, the output will follow the regex pattern.

    guided_choice string[]

    If specified, the output will be exactly one of the choices.

    response_format

    object

    Use this is you want to guarantee a JSON output without defining a schema. For control over the schema, use guided_json.

    content string

    Possible values: [text, json_object]

    min_p number

    This is an alternative to top_p that many prefer. Must be in [0, 1].

    n number

    This will return multiple choices for you instead of a single chat completion.

    tools

    object[]

    The retrieval tool type is unique to Telnyx. You may pass a list of embedded storage buckets for retrieval-augmented generation.

  • Array [

  • anyOf

  • ]

  • tool_choice string

    Possible values: [none, auto]

    use_beam_search boolean

    Setting this to true will allow the model to explore more completion options. This is not supported by OpenAI.

    best_of integer

    This is used with use_beam_search to determine how many candidate beams to explore.

    length_penalty number

    Default value: 1

    This is used with use_beam_search to prefer shorter or longer completions.

    early_stopping boolean

    This is used with use_beam_search. If true, generation stops as soon as there are best_of complete candidates; if false, a heuristic is applied and the generation stops when is it very unlikely to find better candidates.

    logprobs boolean

    Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.

    top_logprobs integer

    This is used with logprobs. An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability.

    frequency_penalty number

    Default value: 0

    Higher values will penalize the model from repeating the same output tokens.

    presence_penalty number

    Default value: 0

    Higher values will penalize the model from repeating the same output tokens.

    top_p number

    An alternative or complement to temperature. This adjusts how many of the top possibilities to consider.

    openai_api_key string

    If you are using OpenAI models using our API, this is how you pass along your OpenAI API key.

Responses

200: Successful Response

422: Validation Error

Loading...