Introduction

SUTRA is a family of multilingual, cost-efficient Generative AI models with conversation, search, and visual capabilities, trained and developed by TWO. SUTRA is a Sanskrit word signifying a thread or aphorism, encapsulating concise wisdom.

SUTRA models are fine-tuned and aligned with proprietary conversational data, ensuring coherent and consistent dialogues. Surpassing other leading LLMs on multilingual performance by 20-30% on the MMLU benchmark, SUTRA models excel at comprehending and generating responses across numerous languages. SUTRA models are online LLMs that can use knowledge from the internet to provide hallucination-free, factual, and up-to-date responses. Purpose-built multilingual tokenizers yield 5x to 8x greater efficiency and considerable cost savings during generation. Furthermore, SUTRA models can generate responses with a time-to-first-byte (TTFB) of 100-200 ms and 1.3x-5x higher token throughput compared to prominent models, enhancing overall performance and usability.
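The tokenizer-efficiency claim can be illustrated with a toy comparison. This is a hedged sketch, not SUTRA's actual tokenizer: an English-centric tokenizer often falls back to UTF-8 bytes for scripts it has not seen, while a vocabulary purpose-built for a language can cover whole words in a single token. The word-level scheme and the Hindi vocabulary below are hypothetical.

```python
# Toy illustration of multilingual tokenizer efficiency (NOT SUTRA's tokenizer).
# An English-centric tokenizer with byte fallback spends ~3 tokens per
# Devanagari character (UTF-8 encodes each in 3 bytes), while a vocabulary
# that contains Hindi words needs one token per word for the same text.

def byte_fallback_token_count(text: str) -> int:
    """Worst case for an unseen script: one token per UTF-8 byte."""
    return len(text.encode("utf-8"))

def multilingual_token_count(text: str, vocab: set) -> int:
    """One token per whitespace-separated word found in the vocabulary;
    byte fallback otherwise (a deliberately simple word-level scheme)."""
    return sum(1 if w in vocab else len(w.encode("utf-8"))
               for w in text.split())

hindi = "नमस्ते आप कैसे हैं"                 # "Hello, how are you?"
vocab = {"नमस्ते", "आप", "कैसे", "हैं"}      # hypothetical multilingual vocab

byte_tokens = byte_fallback_token_count(hindi)   # 48: one per UTF-8 byte
word_tokens = multilingual_token_count(hindi, vocab)  # 4: one per word
print(f"byte fallback: {byte_tokens} tokens, multilingual: {word_tokens} tokens")
```

The exact ratio in this toy (12x) overstates what real subword tokenizers see, but the direction is the same: fewer tokens per sentence means less compute and lower cost per generated response.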

SUTRA models come in multiple flavors:

  • SUTRA-PRO: Our most advanced language model, ideal for nuanced multilingual conversations in more than 50 languages.

  • SUTRA-LIGHT: Lighter models, ideal for multilingual conversations in more than 50 languages.

  • SUTRA-ONLINE: Online language models that can use knowledge from the internet to provide the most up-to-date and factual responses.

  • SUTRA-AVATAR: A model that uses generative ML to enable real-time, lifelike visual and voice synthesis.

Why Choose SUTRA?

We’ve built LLMs to tackle major issues in existing models: hallucinated responses, lack of up-to-date information, English-centricity, high cost and latency, and inefficient generation. Our training involves proprietary multi-language datasets, tokenizers, and models covering languages such as English, Hindi, Tamil, and Korean, supported by our advanced AI infrastructure to serve our global audience.

  1. LLMs that support natural conversations across languages, with leading performance on the MMLU benchmark beyond English
  2. Online LLMs that provide factual and up-to-date responses using live data from the internet
  3. Fine-tuned models and tokenizers built on proprietary conversational data
  4. Cost-effective deployment at scale, with fast responses and high throughput (~120 tokens/second)
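To make the deployment story concrete, here is a minimal sketch of calling a SUTRA model, assuming an OpenAI-style chat-completions interface. The endpoint URL, model id, and `SUTRA_API_KEY` environment variable below are illustrative placeholders; confirm the real values in TWO's API documentation.

```python
import json
import os
import urllib.request

# Illustrative placeholders -- NOT confirmed values; check TWO's API docs.
API_URL = "https://api.example-two.ai/v2/chat/completions"  # assumed route
MODEL = "sutra-pro"                                          # hypothetical id

def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Assemble an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "stream": False,
    }

def ask_sutra(prompt: str) -> str:
    """Send the payload and return the assistant's reply text."""
    payload = build_chat_request(prompt)
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['SUTRA_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # Hindi prompt: "What is the capital of India?"
    print(ask_sutra("भारत की राजधानी क्या है?"))
```

Setting `"stream": True` in the payload would surface the low TTFB noted above, since tokens arrive as they are generated rather than after the full response completes.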

Drawbacks of Current LLMs

Leading LLMs like Llama-3, GPT-3.5/4, and Mistral have gained popularity in recent years, but these models are trained predominantly on English text, and most LLM architectures are designed with an English-centric approach. As a result, while these LLMs perform well in English, their effectiveness diminishes significantly in other languages: they miss nuances and cultural context, revert to English, struggle with complex multilingual tasks, or fail outright. Moreover, these large, English-centric models are inefficient and poorly suited to specific non-English use cases.