
Inference Stream

POST /ai/generate_stream

Generate text using an available LLM (Large Language Model), streaming tokens back as they are generated. Streaming reduces perceived latency and improves the overall user experience, since the client can begin rendering output without waiting for the full completion.

Request

Body (required)

    text string[] (required)
    model object: Provide a namespace for one of our supported models.
    bucket Bucket
    openai_api_key OpenAI API Key
    max_tokens Max Tokens. Default value: 128
    temperature Temperature. Default value: 0.9
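A minimal request sketch in Python, assuming the API is reachable at a placeholder base URL and that the `requests` library is available; the `bucket` value and the exact shape of the `model` object are illustrative assumptions, not part of this spec:

```python
import requests

BASE_URL = "https://api.example.com"  # hypothetical host; substitute your deployment

payload = {
    "text": ["Write a haiku about the ocean."],  # required: array of prompt strings
    "model": {"bucket": "your-model-bucket"},    # assumed shape of the model namespace object
    "max_tokens": 128,                           # default: 128
    "temperature": 0.9,                          # default: 0.9
}

# stream=True keeps the connection open so tokens can be read as they arrive.
response = requests.post(f"{BASE_URL}/ai/generate_stream", json=payload, stream=True)
response.raise_for_status()
```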

Responses

200: Successful Response

Schema

    data object (required)
        token string
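Each 200 chunk carries a single token under `data.token`. A hedged sketch of consuming the stream opened above, assuming newline-delimited JSON framing (the framing is an assumption; adapt if the server uses server-sent events):

```python
import json

# Assumes each streamed line is a JSON object matching the 200 schema:
# {"data": {"token": "..."}}
generated = []
for line in response.iter_lines():
    if not line:
        continue  # skip keep-alive blank lines
    chunk = json.loads(line)
    token = chunk["data"]["token"]
    generated.append(token)
    print(token, end="", flush=True)  # render each token as it arrives

full_text = "".join(generated)
```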

422: Validation Error

Schema

    detail object[], each item:
        loc string[] (required)
        msg Message (required)
        type Error Type (required)
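If the request body fails validation, the 422 response follows the schema above. A small sketch of surfacing each error (the field names come from the schema; the handling itself is illustrative):

```python
def report_validation_errors(resp):
    """Print each validation error from a 422 response body."""
    for err in resp.json()["detail"]:
        location = ".".join(str(part) for part in err["loc"])
        print(f"{err['type']} at {location}: {err['msg']}")
```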
