SUTRA vs Mistral Models
Recently, Mistral has released a series of models, such as Mistral-7B and Mixtral-8x7B, the latter built with a mixture-of-experts architecture. They are lightweight compared to closed-source models like GPT-3.5 and GPT-4, and have fewer than half the parameters of Llama-70B. Although open-source and impressive in its generation capabilities, Mixtral still struggles with multilingual performance, particularly for Indian languages. At times we have noticed that the Mixtral model generates repeated text for non-English languages and does not complete sentences properly.
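The repetition issue described above can be quantified with a simple degeneracy check. The sketch below is an illustrative metric (not the evaluation method used here, and the function name `repeated_ngram_ratio` is our own): it measures the fraction of word n-grams that occur more than once in a generation, which spikes when a model loops on the same phrase.

```python
from collections import Counter

def repeated_ngram_ratio(text: str, n: int = 3) -> float:
    """Fraction of whitespace-token n-grams that occur more than once.

    A high ratio suggests degenerate, repetitive generation of the
    kind observed in Mixtral's non-English outputs.
    """
    tokens = text.split()
    if len(tokens) < n:
        return 0.0
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(ngrams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)

# A looping output scores near 1.0; varied text scores near 0.0.
print(repeated_ngram_ratio("the city the city the city the city"))  # → 1.0
print(repeated_ngram_ratio("Seoul is the capital of South Korea"))  # → 0.0
```

In practice, decoding-time mitigations such as a repetition penalty or an n-gram blocking constraint can suppress this failure mode, but they do not fix the underlying multilingual weakness.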
Question: “Tell me a little bit about Seoul in English, Hindi, Gujarati and Korean.”
Response from SUTRA-LIGHT:
Response from Mixtral-8x7B: (Note how words get repeated in the non-English languages, leading to incomprehensible text generation.)