SUTRA using LiteLLM

SUTRA by TWO Platforms

SUTRA is a family of large multi-lingual language models (LMLMs) pioneered by Two Platforms. SUTRA’s dual-transformer approach extends the power of both MoE and Dense AI language model architectures, delivering cost-efficient multilingual capabilities for over 50+ languages. It powers scalable AI applications for conversation, search, and advanced reasoning, ensuring high-performance across diverse languages, domains and applications.

LiteLLM 🚅

LiteLLM simplifies access to 100+ large language models (LLMs) with a unified API. It enables easy model integration, spend tracking, rate-limiting, fallbacks, and observability—helping developers manage LLMs like OpenAI, Anthropic, Groq, Cohere, Google, and more from a single interface.

Get Your API Keys

Before you begin, make sure you have:

A SUTRA API key (Get yours at TWO AI's SUTRA API page)
Basic familiarity with Python and Jupyter notebooks

This notebook is designed to run in Google Colab, so no local Python installation is required.

Sutra using LiteLLM 🚅

###Install Requirements

# Install required packages
!pip install litellm

Setup API Keys

import os
from google.colab import userdata

# Set the API key from Colab secrets
os.environ["SUTRA_API_KEY"] = userdata.get("SUTRA_API_KEY")

Initialize Sutra Model via LiteLLM 🚅:

import os
import litellm

# Environment variable for better security
api_key = os.environ.get("SUTRA_API_KEY")

# Call the Sutra LLM via LiteLLM
response = litellm.completion(
    model="openai/sutra-v2",
    api_key=api_key,
    api_base="https://api.two.ai/v2",
    messages=[
        {
            "role": "user",
            "content": "मुझे मंगल ग्रह के बारे में 5 पैराग्राफ दीजिए"
        }
    ],
    temperature=0.7,
    max_tokens=500
)

print(response['choices'][0]['message']['content'])

Example 1: Streaming Response :

response = litellm.completion(
    model="openai/sutra-v2",
    api_key=api_key,
    api_base="https://api.two.ai/v2",
    messages=[
        {
            "role": "user",
            "content": "Explain quantum computing in simple terms"
        }
    ],
    temperature=0.8,
    max_tokens=1024,
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end='')

Example 2: Code Generation :

response = litellm.completion(
    model="openai/sutra-v2",
    api_key=api_key,
    api_base="https://api.two.ai/v2",
    messages=[
        {
            "role": "user",
            "content": "Write a Python function to find the factorial of a number using recursion"
        }
    ],
    temperature=0.5,
    max_tokens=700
)

print(response['choices'][0]['message']['content'])

Example 3: Multilingual Support :

languages = ["Hindi", "Tamil", "Bengali", "Spanish", "Arabic"]
questions = [
    "भारतीय स्वतंत्रता संग्राम के बारे में बताइए",
    "தமிழ் நாடு பற்றி எழுதுங்கள்",
    "বাংলা সাহিত্যের ইতিহাস কী?",
    "¿Qué es la inteligencia artificial?",
    "ما هو الذكاء الاصطناعي؟"
]
for lang, q in zip(languages, questions):
    print(f"\n--- {lang} ---")
    response = litellm.completion(
        model="openai/sutra-v2",
        api_key=api_key,
        api_base="https://api.two.ai/v2",
        messages=[{"role": "user", "content": q}],
        temperature=0.6,
        max_tokens=500
    )
    print(response['choices'][0]['message']['content'])

Example 4:Multilingual Translation :

translations = [
    ("English", "Hindi", "Artificial Intelligence is transforming every industry."),
    ("English", "Japanese", "The global economy is experiencing rapid changes."),
    ("English", "Arabic", "Climate change is a pressing global issue."),
    ("English", "Tamil", "Education is the foundation of a strong society."),
    ("English", "French", "Healthcare innovation is advancing at an incredible pace.")
]

for source_lang, target_lang, text in translations:
    print(f"\n=== {source_lang} → {target_lang} ===")
    prompt = f"Translate the following {source_lang} sentence into {target_lang}:\n\n{text}"

    response = litellm.completion(
        model="openai/sutra-v2",
        api_key=api_key,
        api_base="https://api.two.ai/v2",
        messages=[
            {"role": "user", "content": prompt}
        ],
        temperature=0.3,
        max_tokens=200
    )

    print("Original:", text)
    print("Translated:", response['choices'][0]['message']['content'])

Building a Simple Chatbot with LiteLLM 🚅

import os
import litellm

# Basic chat loop with history
print("Chatbot (Sutra): Hello! Type 'exit' to end the conversation.\n")

chat_history = []

while True:
    user_input = input("You: ")

    if user_input.lower() == "exit":
        print("Chatbot: Goodbye! 👋")
        break

    chat_history.append({"role": "user", "content": user_input})

    try:
        response = litellm.completion(
            model="openai/sutra-v2",
            api_key=api_key,
            api_base="https://api.two.ai/v2",
            messages=chat_history,
            temperature=0.7,
            max_tokens=500,
        )

        reply = response["choices"][0]["message"]["content"]
        print("Chatbot:", reply)

        chat_history.append({"role": "assistant", "content": reply})

    except Exception as e:
        print("Error:", str(e))

SUTRA using LiteLLM

On this page