Krux

India Open-Sources 22-Language AI Models Trained from Scratch
Published: March 9, 2026 at 12:30 AM
What happened
Sarvam just released two reasoning models with weights freely available under the Apache 2.0 license. The 30B and 105B models were trained from scratch in India and work across 22 Indian languages, rather than having non-English support tacked on as an afterthought. The smaller model handles 32,000 tokens of context; the larger handles 128,000, enough to analyze entire codebases or lengthy legal documents in Hindi, Tamil, or Bengali. Both use a Mixture-of-Experts (MoE) architecture to stay efficient: despite its size, the 30B model activates only about 1 billion parameters per token.
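To make that efficiency claim concrete, here is a minimal, illustrative sketch of top-k expert routing in PyTorch. It is not Sarvam's implementation, and the dimensions, expert count, and k are arbitrary; the point is the mechanism. A learned router scores the experts for each token, and only the top-k experts actually run, so the compute per token is a small fraction of the layer's total parameter count.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Toy MoE layer: a router picks k of n experts per token, so only a
    small slice of the layer's total parameters runs for any one token."""

    def __init__(self, dim=64, n_experts=8, k=2):  # illustrative sizes only
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_experts)])
        self.k = k

    def forward(self, x):  # x: (tokens, dim)
        scores = self.router(x)                     # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # best k experts per token
        weights = weights.softmax(dim=-1)           # normalize the k gate values
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = TopKMoE()
tokens = torch.randn(5, 64)
print(moe(tokens).shape)  # torch.Size([5, 64]); only 2 of 8 experts ran per token
```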
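Because the weights ship under Apache 2.0, anyone can download and run them locally. A minimal sketch using the Hugging Face transformers library follows; the repository id is a placeholder, not a confirmed name, so check Sarvam's release page for the actual one.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- the real name comes from Sarvam's release page.
model_id = "sarvamai/<released-model-name>"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # spread layers across available GPUs/CPU
)

# A Hindi prompt: "Summarize this contract: ..."
prompt = "इस अनुबंध का सारांश दीजिए: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```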
Why it matters
India is making a sovereign AI bet, building homegrown alternatives to reduce dependence on Western models that barely understand its linguistic diversity.