SUTRA-MULTILINGUAL
Perform an LLM completion with a streaming response in OpenAI format. The request payload is a CompletionParams object. Returns OpenAI-format chunks in a server-sent events (SSE) stream.
POST
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
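For illustration, a minimal sketch of building this header in Python; the environment variable name SUTRA_API_KEY is an assumption, not part of this reference:

```python
import os

# Read the auth token from the environment; the variable name is illustrative.
api_key = os.environ["SUTRA_API_KEY"]

# Bearer authentication header as described above.
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}
```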
Body
application/json
User input for a chat completion request.
Prompt: The LLM prompt.
Model: A model name, for example 'sutra-light'. Available options: sutra-light, sutra-pro, sutra-online.
Maximum tokens: The maximum number of tokens to generate before terminating. This number cannot exceed the context window of the selected model. The default value is 1024.
Stop: May be a string, null, or an array of strings.
Temperature: Controls the randomness of the response; a lower temperature gives lower randomness. Values are in the range [0, 2] with a default value of 0.3.
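As a sketch, a request body combining these fields might look like the following Python dictionary. The field names prompt, model, max_tokens, stop, and temperature follow the OpenAI-style naming implied above and are assumptions about CompletionParams, not confirmed by this reference:

```python
# Example CompletionParams payload; field names are assumed OpenAI-style names.
payload = {
    "prompt": "Translate 'good morning' into Hindi.",  # The LLM prompt.
    "model": "sutra-light",        # One of sutra-light, sutra-pro, sutra-online.
    "max_tokens": 256,             # Must not exceed the model's context window; default 1024.
    "stop": None,                  # Optional: a string, null, or an array of strings.
    "temperature": 0.3,            # Range [0, 2]; lower values give lower randomness; default 0.3.
}
```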
Response
200 - text/event-stream; charset=utf-8
A stream of server-sent events (SSE) conforming to the OpenAI format.
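A minimal sketch of issuing the POST request and reading the SSE stream with the requests library. The endpoint URL is a placeholder, since this reference does not state the full path, and the payload fields and [DONE] sentinel follow OpenAI conventions rather than anything confirmed here:

```python
import json
import os
import requests

# Placeholder URL; substitute the actual endpoint for this POST operation.
URL = "https://api.example.com/v2/completions"

headers = {
    "Authorization": f"Bearer {os.environ['SUTRA_API_KEY']}",  # variable name is illustrative
    "Content-Type": "application/json",
}

payload = {
    "prompt": "Translate 'good morning' into Hindi.",
    "model": "sutra-light",
    "max_tokens": 256,
    "temperature": 0.3,
}

with requests.post(URL, headers=headers, json=payload, stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        # SSE data lines are prefixed with "data: "; OpenAI-style streams
        # conventionally end with a "data: [DONE]" sentinel.
        if not line or not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)  # One OpenAI-format chunk object.
        print(chunk)
```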