Krux

April 24, 2026
Google Splits TPUs Into Two Chips for Training and Inference
Published: April 24, 2026 at 12:20 AM
Updated: April 24, 2026 at 12:20 AM
What happened
Google's eighth-generation TPUs break from tradition by splitting into two specialized chips: one for training models, another for running them. The training chip can link 9,600 processors with 2 petabytes of shared memory, potentially shrinking model development from months to weeks. The inference chip is built for AI agents that need to coordinate in real time with minimal lag. Both arrive later this year, though pricing remains under wraps.
Why it matters
The split signals a bet that the future of AI isn't just bigger models, but swarms of specialized agents that need fundamentally different hardware to train versus run.