SUTRA-MULTILINGUAL
Performs an LLM completion with a streaming response in the OpenAI format. The request payload is a CompletionParams object. Returns OpenAI-format chunks in a server-sent event (SSE) stream.
POST /v2/chat/completions
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
application/json
User input for a chat completion request.
A model name, for example 'sutra-light'.
Available options: sutra-light, sutra-pro, sutra-online
The LLM prompt.
The maximum number of tokens to generate before terminating. This number cannot exceed the context window for the selected model. The default value is 1024.
Controls the randomness of the response; lower temperatures produce more deterministic output. Values are in the range [0, 2], with a default of 0.3.
May be a string, null, or an array of strings.
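A request body built from the parameters above might look like the following sketch. The base URL is a placeholder, and the field names (model, messages, max_tokens, temperature, stream) are assumptions drawn from the OpenAI chat-completions convention this endpoint is described as compatible with, not confirmed by this page.

```python
import json

BASE_URL = "https://api.example.com"  # placeholder -- substitute the actual SUTRA API host
API_KEY = "YOUR_AUTH_TOKEN"

# Field names assume the OpenAI chat-completions request shape.
payload = {
    "model": "sutra-light",  # one of: sutra-light, sutra-pro, sutra-online
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 1024,      # default per the docs; must not exceed the model's context window
    "temperature": 0.3,      # range [0, 2]; default 0.3
    "stream": True,          # request an SSE stream
}

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

body = json.dumps(payload)
print(body)
```

The serialized body would be sent as a POST to BASE_URL + "/v2/chat/completions" with the headers shown.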
Response
200 - text/event-stream; charset=utf-8
A stream of server-sent events (SSE) conforming to the OpenAI format.