Krux

New AI Model Runs on 12.5% of Its Brain
Published: May 10, 2026 at 12:13 AM
Updated: May 10, 2026 at 12:13 AM
What happened
AllenAI released EMO, a model designed to run with just 16 of its 128 internal expert modules active, losing only 3% accuracy. The trick is training the model to be modular from scratch rather than bolting modularity on afterward. You could load only the math and code experts for a coding assistant, or only the health experts for a medical chatbot, cutting memory needs by roughly 87% compared to the full model (16 of 128 experts is 12.5% of the expert weights). The model has 14 billion total parameters and was trained on a trillion tokens. The full model, training code, and a baseline comparison are public on Hugging Face now.
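To make the memory math concrete, here is a minimal sketch of the idea: a toy mixture-of-experts layer whose router is restricted to a chosen subset of experts, so the remaining experts never need to be loaded. Everything here (the `TinyMoELayer` class, the scoring function, the numbers other than the 128-expert / 16-active configuration) is an illustrative assumption, not EMO's actual implementation.

```python
import math
import random

TOTAL_EXPERTS = 128   # experts trained into the full model
ACTIVE_EXPERTS = 16   # experts kept for a specialized deployment

class TinyMoELayer:
    """Toy MoE layer: each expert is a single scalar weight, and the
    router may only pick from an allowed subset (e.g. the experts a
    coding assistant needs)."""

    def __init__(self, n_experts, seed=0):
        rng = random.Random(seed)
        self.experts = [rng.uniform(-1.0, 1.0) for _ in range(n_experts)]

    def forward(self, x, allowed, top_k=2):
        # Toy router scores; only experts in `allowed` are eligible,
        # so the other experts' weights never have to be in memory.
        scores = {e: math.sin(x * (e + 1)) for e in allowed}
        chosen = sorted(scores, key=scores.get, reverse=True)[:top_k]
        # Softmax-style mixing weights over the chosen experts.
        weights = [math.exp(scores[e]) for e in chosen]
        total = sum(weights)
        return sum(w / total * self.experts[e] * x
                   for w, e in zip(weights, chosen))

layer = TinyMoELayer(TOTAL_EXPERTS)
subset = list(range(ACTIVE_EXPERTS))  # pretend these are the code experts
y = layer.forward(3.0, allowed=subset)

# Expert-weight memory scales with the size of the kept subset:
fraction_kept = ACTIVE_EXPERTS / TOTAL_EXPERTS
print(f"expert memory kept: {fraction_kept:.1%}")  # 12.5%
print(f"reduction: {1 - fraction_kept:.1%}")       # 87.5%
```

The 16/128 ratio is where both headline numbers come from: 12.5% of the expert weights stay resident, an 87.5% reduction, which the article rounds to 87% (shared non-expert parameters would make the real saving slightly smaller).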
Why it matters
Sparse models just got practical enough to fit on hardware you might actually have.