OpenAI Built a Network Protocol to Stop GPU Hiccups


Published: May 8, 2026 at 12:19 AM

Updated: May 8, 2026 at 12:19 AM


What happened

OpenAI released MRC, a new network protocol designed to keep thousands of GPUs training in sync without choking on network failures. The system sprays packets across hundreds of paths and reroutes around breakdowns in microseconds. Before, a single network glitch could kill an entire training run. OpenAI already runs MRC on its massive NVIDIA GB200 clusters and published the spec to the Open Compute Project so others can adopt it. The catch? It needs hardware support baked into 800Gb/s network cards.
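The core idea, spraying traffic across many paths and steering around a dead link as soon as it is detected, can be sketched in a few lines. This is a toy illustration under assumed names (`MultipathSprayer`, `mark_failed`), not the MRC spec; the real protocol does this in NIC hardware at microsecond timescales.

```python
import itertools

class MultipathSprayer:
    """Toy model of multipath packet spraying with failover.

    Hypothetical simplification for illustration only; not the MRC protocol.
    """

    def __init__(self, num_paths):
        self.paths = list(range(num_paths))
        self.healthy = set(self.paths)
        self._rr = itertools.cycle(self.paths)  # round-robin spraying order

    def mark_failed(self, path):
        # In real hardware this reroute decision happens in microseconds;
        # here it is just a set removal.
        self.healthy.discard(path)

    def next_path(self):
        # Skip failed paths; spraying continues on the survivors.
        if not self.healthy:
            raise RuntimeError("all paths down")
        for path in self._rr:
            if path in self.healthy:
                return path

    def send(self, packets):
        # Assign each packet to the next healthy path.
        return [(pkt, self.next_path()) for pkt in packets]

sprayer = MultipathSprayer(num_paths=4)
first = sprayer.send(range(4))   # packets land on paths 0, 1, 2, 3
sprayer.mark_failed(2)           # simulate a link failure
second = sprayer.send(range(4))  # path 2 is skipped; traffic continues
```

The point of the sketch: a single path failure degrades capacity slightly instead of stalling the whole transfer, which is why the article notes a glitch no longer kills a training run.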

Why it matters

As AI companies race to train on ever-larger GPU farms, the pipes connecting them matter as much as the chips themselves.
