Krux

April 24, 2026
Google Splits TPUs Into Two Chips for Training and Inference
Published: April 24, 2026 at 12:20 AM
Updated: April 24, 2026 at 12:20 AM
What happened
Google's eighth-generation TPUs break from tradition by splitting into two specialized chips: one for training models, another for running them. The training chip can link 9,600 processors with 2 petabytes of shared memory, potentially shrinking model development from months to weeks. The inference chip is built for AI agents that need to coordinate in real time with minimal lag. Both arrive later this year, though pricing remains under wraps.
Why it matters
The split signals a bet that the future of AI isn't just bigger models, but swarms of specialized agents that need fundamentally different hardware to train versus run.