New AI Model Runs on 12.5% of Its Brain

Published: May 10, 2026 at 12:13 AM

Updated: May 10, 2026 at 12:13 AM

100-word summary

AllenAI released EMO, a model designed to keep working with only 16 of its 128 internal expert modules (12.5%) active, losing only about 3% accuracy. The trick is training the model to be modular from scratch rather than bolting modularity on afterward. You could run only the math and code experts for a coding assistant, or only the health experts for a medical chatbot, cutting memory needs by roughly 87% compared to the full model. EMO has 14 billion total parameters and was trained on a trillion tokens. The full model, training code, and a baseline for comparison are public on Hugging Face now. Sparse models just got practical enough to fit on hardware you might actually have.

What happened

AllenAI released EMO, a model designed to keep working with only 16 of its 128 internal expert modules (12.5%) active, losing only about 3% accuracy. The trick is training the model to be modular from scratch rather than bolting modularity on afterward. You could run only the math and code experts for a coding assistant, or only the health experts for a medical chatbot, cutting memory needs by roughly 87% compared to the full model. EMO has 14 billion total parameters and was trained on a trillion tokens. The full model, training code, and a baseline for comparison are public on Hugging Face now.
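To make the idea of running on a subset of experts concrete, here is a minimal sketch of routing restricted to the experts that are actually loaded. It is not EMO's code. The function name `route_top_k`, the choice of 2 experts per token, the made-up router scores, and the assumption that experts 0 through 15 form the "math and code" group are all hypothetical; only the 16-of-128 figure comes from the article.

```python
import math

def route_top_k(logits, allowed, k):
    """Pick the k highest-scoring experts among `allowed`.

    Returns {expert_index: weight}, with weights softmax-normalized
    over the selected experts only.
    """
    if k > len(allowed):
        raise ValueError("k cannot exceed the number of allowed experts")
    # Rank only the experts that are loaded; all others are ignored.
    ranked = sorted(allowed, key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the selected scores (subtract the max for stability).
    top = max(logits[i] for i in ranked)
    exps = {i: math.exp(logits[i] - top) for i in ranked}
    total = sum(exps.values())
    return {i: v / total for i, v in exps.items()}

NUM_EXPERTS = 128  # from the article
KEPT = 16          # from the article
kept_fraction = KEPT / NUM_EXPERTS  # 16 / 128 = 0.125, the headline's 12.5%

# Hypothetical: pretend experts 0..15 are the "math and code" group.
allowed = list(range(KEPT))
# Made-up router scores for one token, one score per expert.
logits = [((i * 37) % 128) / 16.0 for i in range(NUM_EXPERTS)]
# k=2 is an assumption; the article does not say how many experts fire per token.
weights = route_top_k(logits, allowed, k=2)
```

In this sketch the router never looks at experts outside `allowed`, so their weights never need to be in memory. Whether accuracy holds up under that restriction depends on how the model was trained, which is the article's point about building modularity in from the start.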

Why it matters

Sparse models just got practical enough to fit on hardware you might actually have.
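A rough back-of-the-envelope sketch shows where a figure like 87% could come from. The 14 billion parameter count and the 16-of-128 ratio are from the article. The helper `subset_memory_gb`, the 2 bytes per parameter, and the share of parameters that sit inside experts are assumptions for illustration, not published numbers.

```python
def subset_memory_gb(total_params, expert_share, kept, num_experts, bytes_per_param=2):
    """Estimate weight memory in GB when only `kept` experts are loaded.

    expert_share is the fraction of all parameters that live in experts;
    the rest (attention, embeddings, and so on) is always loaded.
    """
    shared = total_params * (1 - expert_share)
    experts = total_params * expert_share * kept / num_experts
    return (shared + experts) * bytes_per_param / 1e9

TOTAL_PARAMS = 14e9  # from the article

# Upper bound: if every parameter were an expert parameter,
# dropping 112 of 128 experts would save exactly 87.5%.
full_all = subset_memory_gb(TOTAL_PARAMS, 1.0, 128, 128)
sub_all = subset_memory_gb(TOTAL_PARAMS, 1.0, 16, 128)
saving_upper = 1 - sub_all / full_all

# Hypothetical split: 95% of parameters in experts, 5% shared.
full = subset_memory_gb(TOTAL_PARAMS, 0.95, 128, 128)
sub = subset_memory_gb(TOTAL_PARAMS, 0.95, 16, 128)
saving = 1 - sub / full
```

Under these assumptions the real saving lands a little below the 87.5% ceiling, because the shared parameters have to be loaded no matter which experts are kept.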
