Perform chat completion with a streaming response. The request payload is a MultilingualUserInput object. Returns a stream of LLMChunk objects followed by an LLMReply object.
POST /v1/sutra-light/completion
Authorizations
authorization
string
header, required

Body
application/json
User input for a completion request.
model
enum<string>
required
A model name, for example 'sutra-light'.
Available options: sutra-light, sutra-pro, sutra-turbo
messages
object[]
required
The LLM prompt.
max_tokens
number
The maximum number of tokens to generate before terminating. This number cannot exceed the context window for the selected model. The default value is 1024.
temperature
number
Controls the randomness of the response; a lower temperature gives a less random response. Values are in the range [0, 2] with a default value of 0.3.
stop
object
May be a string, null or an array of strings.
presence_penalty
number
frequency_penalty
number
top_p
number
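A minimal request sketch in Python follows. The base URL, the Bearer-token header format, and the message fields ("role"/"content") are assumptions, not taken from this reference; check your deployment and the MultilingualUserInput schema for the exact shapes.

```python
# Sketch of a streaming completion request (assumed base URL and header format).
import requests

BASE_URL = "https://api.example.com"  # placeholder, not the documented host
API_KEY = "YOUR_API_KEY"

payload = {
    "model": "sutra-light",  # or "sutra-pro" / "sutra-turbo"
    "messages": [
        # "role"/"content" are assumed message fields, per common chat-completion conventions
        {"role": "user", "content": "Summarize the plot of the Ramayana in Hindi."}
    ],
    "max_tokens": 256,       # must not exceed the model's context window (default 1024)
    "temperature": 0.3,      # range [0, 2], default 0.3
    "stop": None,            # string, null, or array of strings
}

response = requests.post(
    f"{BASE_URL}/v1/sutra-light/completion",
    headers={"authorization": f"Bearer {API_KEY}"},
    json=payload,
    stream=True,             # response arrives as an application/x-ndjson stream
    timeout=60,
)
response.raise_for_status()
```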
Response
200 - application/x-ndjson
A stream of newline-delimited JSON objects.
typeName
enum<string>
required
Available options: LLMChunk, LLMReply
isFinal
boolean
Indicates if this is the final chunk.
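Continuing the sketch above, the NDJSON stream can be consumed line by line. Only typeName and isFinal are documented here; the field carrying each chunk's text (assumed below to be "content") is a placeholder.

```python
import json

# `response` is the streaming requests.Response from the request sketch above.
for line in response.iter_lines():
    if not line:
        continue  # skip keep-alive blank lines
    event = json.loads(line)
    if event["typeName"] == "LLMChunk":
        # "content" is an assumed field name for the chunk text.
        print(event.get("content", ""), end="", flush=True)
    elif event["typeName"] == "LLMReply":
        # Final object in the stream; isFinal marks the end of generation.
        if event.get("isFinal"):
            break
```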