Krux

OpenAI Built a Network Protocol to Stop GPU Hiccups
Published: May 8, 2026 at 12:19 AM
Updated: May 8, 2026 at 12:19 AM
What happened
OpenAI released MRC, a new network protocol designed to keep thousands of GPUs training in sync without choking on network failures. The system sprays packets across hundreds of paths and reroutes around breakdowns in microseconds. Until now, a single network glitch could kill an entire training run. OpenAI already runs MRC on its massive NVIDIA GB200 clusters and has published the spec to the Open Compute Project so others can adopt it. The catch? It needs hardware support baked into 800Gb/s network cards.
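To make the idea concrete, here is a minimal toy sketch of the two behaviors described above: spraying packets round-robin across many paths, and immediately rerouting around a path once it is marked failed. The class and method names are illustrative assumptions, not taken from the MRC spec, and real implementations do this in NIC hardware, not Python.

```python
class MultipathSender:
    """Toy model of multipath packet spraying with fast rerouting.

    Hypothetical names for illustration only -- not the MRC API.
    """

    def __init__(self, num_paths):
        self.paths = list(range(num_paths))
        self.failed = set()      # paths currently known to be broken
        self._cursor = 0         # round-robin position for spraying

    def live_paths(self):
        """Paths that have not been marked failed."""
        return [p for p in self.paths if p not in self.failed]

    def mark_failed(self, path):
        # Stands in for microsecond-scale failure detection:
        # every subsequent packet avoids this path immediately.
        self.failed.add(path)

    def send(self, packets):
        """Spray packets round-robin across the live paths.

        Returns a dict mapping each packet to the path it was sent on.
        """
        live = self.live_paths()
        if not live:
            raise RuntimeError("no live paths remaining")
        assignment = {}
        for i, pkt in enumerate(packets):
            assignment[pkt] = live[(self._cursor + i) % len(live)]
        self._cursor += len(packets)
        return assignment


sender = MultipathSender(num_paths=4)
first = sender.send(["p0", "p1", "p2", "p3"])   # spread across paths 0-3
sender.mark_failed(2)                            # a link breaks mid-run
second = sender.send(["p4", "p5", "p6"])         # rerouted to paths 0, 1, 3
```

The key property is that a failed path never stalls the sender: traffic simply redistributes over the survivors, which is what lets a training run survive glitches that would previously have killed it.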
Why it matters
As AI companies race to train on ever-larger GPU farms, the pipes connecting them matter as much as the chips themselves.