SUTRA-MULTILINGUAL
Perform an LLM completion with a streaming response in OpenAI format. The request payload is a CompletionParams object. Returns OpenAI-format chunks in a server-sent events (SSE) stream.
POST /v2/chat/completions
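A minimal request-building sketch using only the Python standard library. The base URL and token are placeholders (substitute your actual SUTRA API host and auth token); the field names follow the parameter list below.

```python
import json
import urllib.request

# Placeholder values: replace with your actual SUTRA API host and auth token.
BASE_URL = "https://api.example.com"
API_TOKEN = "YOUR_AUTH_TOKEN"

def build_request(model, messages, max_tokens=1024, temperature=0.3):
    """Build a POST request for /v2/chat/completions expecting an SSE response."""
    payload = {
        "model": model,            # e.g. "sutra-light"
        "messages": messages,      # the LLM prompt
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": True,            # the endpoint streams OpenAI-format chunks
    }
    return urllib.request.Request(
        BASE_URL + "/v2/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
        },
        method="POST",
    )

req = build_request("sutra-light", [{"role": "user", "content": "Hello"}])
```

Calling `urllib.request.urlopen(req)` would then yield the SSE response body line by line.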
Authorizations
Authorization
string, header, required
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
application/json
model
enum<string>
required
A model name, for example 'sutra-light'.
Available options: sutra-light, sutra-pro, sutra-online
messages
object[]
required
The LLM prompt.
max_tokens
number
The maximum number of tokens to generate before terminating. This number cannot exceed the context window for the selected model. The default value is 1024.
temperature
number
Controls the randomness of the response; a lower temperature gives less random output. Values are in the range [0, 2]; the default is 0.3.
stop
object
May be a string, null, or an array of strings.
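All three accepted shapes of the stop field serialize to valid JSON payloads; a quick illustration:

```python
import json

def encode_stop(stop):
    """Serialize a payload fragment carrying one accepted 'stop' shape."""
    return json.dumps({"stop": stop})

# A single string, null, and an array of strings are all valid.
examples = [
    encode_stop("\n\n"),
    encode_stop(None),
    encode_stop(["END", "STOP"]),
]
```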
presence_penalty
number
Positive values penalize tokens that have already appeared in the text, encouraging the model to introduce new topics.
frequency_penalty
number
Positive values penalize tokens in proportion to how often they have appeared so far, reducing repetition.
top_p
number
Nucleus-sampling cutoff: the model samples only from the smallest set of tokens whose cumulative probability exceeds top_p.
extra_body
object
Response
200 - text/event-stream; charset=utf-8
A stream of server-sent events (SSE) conforming to the OpenAI chunk format.
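The stream is standard SSE: each event line begins with "data: " and carries an OpenAI-format chunk, ending with a "data: [DONE]" sentinel. A minimal parser sketch (the sample lines below are illustrative, not captured from the API):

```python
import json

def collect_content(sse_lines):
    """Concatenate content deltas from OpenAI-format SSE lines."""
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        data = line[len("data: "):]
        if data == "[DONE]":  # end-of-stream sentinel
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

# Illustrative sample stream following the OpenAI chunk shape.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = collect_content(sample)  # → "Hello"
```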