Krux

New AI Model Runs on 12.5% of Its Brain
Published: May 10, 2026 at 12:13 AM
Updated: May 10, 2026 at 12:13 AM
What happened
AllenAI released EMO, a model designed to run with just 16 of its 128 internal expert modules active, losing only 3% accuracy. The trick is training the model to be modular from scratch rather than bolting modularity on afterward. You could load only the math and code experts for a coding assistant, or only the health experts for a medical chatbot, cutting memory needs by roughly 87% compared to the full model (16 of 128 experts is 12.5% of the expert weights). The model has 14 billion total parameters and was trained on a trillion tokens. The full model, training code, and a baseline comparison are public on Hugging Face now.
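To make the memory math concrete, here is a minimal sketch of the idea: a toy mixture-of-experts layer whose router is restricted to a chosen subset of experts, so the remaining experts never need to be loaded. Everything here (the `TinyMoELayer` class, the scoring function, the numbers other than the 128-expert / 16-active configuration) is an illustrative assumption, not EMO's actual implementation.

```python
import math
import random

TOTAL_EXPERTS = 128   # experts trained into the full model
ACTIVE_EXPERTS = 16   # experts kept for a specialized deployment

class TinyMoELayer:
    """Toy MoE layer: each expert is a single scalar weight, and the
    router may only pick from an allowed subset (e.g. the experts a
    coding assistant needs)."""

    def __init__(self, n_experts, seed=0):
        rng = random.Random(seed)
        self.experts = [rng.uniform(-1.0, 1.0) for _ in range(n_experts)]

    def forward(self, x, allowed, top_k=2):
        # Toy router scores; only experts in `allowed` are eligible,
        # so the other experts' weights never have to be in memory.
        scores = {e: math.sin(x * (e + 1)) for e in allowed}
        chosen = sorted(scores, key=scores.get, reverse=True)[:top_k]
        # Softmax-style mixing weights over the chosen experts.
        weights = [math.exp(scores[e]) for e in chosen]
        total = sum(weights)
        return sum(w / total * self.experts[e] * x
                   for w, e in zip(weights, chosen))

layer = TinyMoELayer(TOTAL_EXPERTS)
subset = list(range(ACTIVE_EXPERTS))  # pretend these are the code experts
y = layer.forward(3.0, allowed=subset)

# Expert-weight memory scales with the size of the kept subset:
fraction_kept = ACTIVE_EXPERTS / TOTAL_EXPERTS
print(f"expert memory kept: {fraction_kept:.1%}")  # 12.5%
print(f"reduction: {1 - fraction_kept:.1%}")       # 87.5%
```

The 16/128 ratio is where both headline numbers come from: 12.5% of the expert weights stay resident, an 87.5% reduction, which the article rounds to 87% (shared non-expert parameters would make the real saving slightly smaller).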
Why it matters
Sparse models just got practical enough to fit on hardware you might actually have.