India Open-Sources 22-Language AI Models Trained from Scratch


Published: March 9, 2026 at 12:30 AM

Updated: March 9, 2026 at 12:30 AM

100-word summary

Sarvam just released two reasoning models with weights freely available under the Apache 2.0 license. The 30B and 105B models were trained from scratch in India and work across 22 Indian languages, not just English with translations tacked on later. The smaller model handles 32,000 tokens of context, the larger one 128,000. That's enough to analyze entire codebases or legal documents in Hindi, Tamil, or Bengali. Both use a Mixture-of-Experts architecture to stay efficient: the 30B activates just 1 billion parameters per query despite its size. India is making a sovereign AI bet, building homegrown alternatives to reduce dependence on Western models that barely understand its linguistic diversity.

What happened

Sarvam just released two reasoning models with weights freely available under the Apache 2.0 license. The 30B and 105B models were trained from scratch in India and work across 22 Indian languages, not just English with translations tacked on later. The smaller model handles 32,000 tokens of context, the larger one 128,000. That's enough to analyze entire codebases or legal documents in Hindi, Tamil, or Bengali. Both use a Mixture-of-Experts architecture to stay efficient: the 30B activates just 1 billion parameters per query despite its size.
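The efficiency claim above comes from how Mixture-of-Experts layers work: a gating network routes each token to only a few "expert" sub-networks, so most parameters sit idle on any given query. Here is a minimal NumPy sketch of generic top-k MoE routing; the shapes, expert count, and top-k value are illustrative assumptions, not details of Sarvam's actual architecture.

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Minimal top-k Mixture-of-Experts layer (illustrative sketch).

    Each token is routed to its top-k experts and their outputs are
    mixed by renormalized gate probabilities. Real MoE layers add
    load-balancing losses, capacity limits, and batched dispatch.
    """
    logits = x @ gate_w                                   # (tokens, n_experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)            # softmax over experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        top = np.argsort(probs[t])[-top_k:]               # indices of top-k experts
        weights = probs[t, top] / probs[t, top].sum()     # renormalize over top-k
        for w, e in zip(weights, top):
            out[t] += w * (x[t] @ experts[e])             # only k experts run
    return out

rng = np.random.default_rng(0)
d, n_experts, tokens = 16, 8, 4                           # toy sizes (assumed)
x = rng.standard_normal((tokens, d))
gate_w = rng.standard_normal((d, n_experts))
experts = rng.standard_normal((n_experts, d, d))

y = moe_forward(x, gate_w, experts, top_k=2)
# With top_k=2 of 8 experts, roughly 2/8 of the expert parameters are
# active per token, which is the same idea behind a 30B model that
# activates only ~1B parameters per query.
```

The routing loop is written per-token for clarity; production implementations batch tokens by expert to keep the computation on dense matrix multiplies.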

Why it matters

India is making a sovereign AI bet, building homegrown alternatives to reduce dependence on Western models that barely understand its linguistic diversity.

Sources