Krux

May 6, 2026
OpenAI Reveals How It Keeps Voice ChatGPT Fast
Published: May 6, 2026 at 12:15 AM
Updated: May 6, 2026 at 12:15 AM
What happened
OpenAI published the architecture behind ChatGPT's voice mode, which now handles 900 million weekly active users. The key insight: splitting WebRTC into edge relays that forward packets and stateful transceivers that handle encryption. This avoids opening a dedicated network socket for every conversation. Traditional voice setups terminate each connection at the backend, creating bottlenecks. OpenAI's design instead uses a tiny shared UDP relay that routes each packet to whichever transceiver owns its session, and geo-steers users to the nearest entry point. The whole stack runs on Kubernetes without special snowflake infrastructure.
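The relay/transceiver split above can be sketched in a few lines. This is a hedged illustration, not OpenAI's actual code: the session table, the bind address, and the assumption that the session key sits where RTP carries its SSRC (bytes 8–12 of the header) are all made up for the example. The point it demonstrates is that the relay stays tiny and stateless: it reads one field, looks up the owning transceiver, and forwards; encryption and per-session state live entirely on the transceiver side.

```python
import socket
import struct

# Hypothetical session table: maps a 4-byte session ID (here, an
# RTP-style SSRC) to the (host, port) of the stateful transceiver
# that owns that session. In a real deployment this mapping would
# be populated by whatever signaling sets the session up.
SESSION_TABLE = {
    0x1A2B3C4D: ("10.0.0.5", 5004),
}


def route(packet: bytes, table: dict):
    """Pick the transceiver that owns this packet's session.

    Assumes the session ID sits in bytes 8-12, where RTP carries
    its SSRC; a production relay would parse the header properly.
    Returns the owner's (host, port), or None if unroutable.
    """
    if len(packet) < 12:
        return None  # too short to carry an RTP-style header
    (ssrc,) = struct.unpack_from("!I", packet, 8)
    return table.get(ssrc)


def relay_loop(bind_addr=("0.0.0.0", 3478)):
    """One shared UDP socket for all sessions: receive, look up
    the owner, forward. No per-conversation sockets, no crypto."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(bind_addr)
    while True:
        packet, _src = sock.recvfrom(2048)
        dest = route(packet, SESSION_TABLE)
        if dest is not None:
            sock.sendto(packet, dest)  # decryption happens on the transceiver
```

Because `route` only touches a header field, the relay never needs the session's keys, which is what lets one small shared socket front many conversations.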
Why it matters
If voice AI feels faster than video calls lately, this is why.