Krux

April 2, 2026
Meta's Ad Engine Now Runs Trillion-Parameter Models in Milliseconds
Published: April 2, 2026 at 12:29 AM
Updated: April 2, 2026 at 12:29 AM
What happened
Meta has brought LLM-scale AI to a place where every millisecond counts: ad auctions. The company's new Adaptive Ranking Model, live on Instagram since late 2025, routes each impression to a right-sized model in real time, keeping response times under a second even when trillion-parameter models are in the mix. Instead of running one massive model for every request, it sizes up each request and picks a model accordingly. The payoff? The system ranking your Instagram ads can now take your entire scrolling history into account before deciding what to show you, not just your last few taps. To pull it off, Meta had to redesign how it spreads calculations across GPUs and compress data on the fly.
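To make the routing idea concrete, here is a minimal sketch of per-request model selection under a latency budget. It is not Meta's implementation: the tier names, parameter counts, latency estimates, history-length thresholds, and the `route` function are all hypothetical, chosen only to illustrate picking a smaller model when the request or the time budget does not justify the largest one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    params: float          # illustrative parameter count, not a real figure
    est_latency_ms: float  # assumed serving latency for this tier


# Hypothetical tiers, ordered from largest to smallest.
TIERS = [
    ModelTier("large", 1e12, 80.0),
    ModelTier("medium", 1e10, 20.0),
    ModelTier("small", 1e8, 5.0),
]


def route(history_len: int, budget_ms: float) -> ModelTier:
    """Pick the largest tier the request justifies that also fits the budget."""
    # Assumed heuristic: a longer user history justifies a larger model.
    if history_len >= 1000:
        start = 0
    elif history_len >= 100:
        start = 1
    else:
        start = 2

    # Walk down from the justified tier until one fits the latency budget.
    for tier in TIERS[start:]:
        if tier.est_latency_ms <= budget_ms:
            return tier

    # Nothing fits: fall back to the smallest tier rather than fail the auction.
    return TIERS[-1]


if __name__ == "__main__":
    for history_len, budget_ms in [(5000, 100.0), (5000, 30.0), (10, 100.0)]:
        tier = route(history_len, budget_ms)
        print(f"history={history_len} budget={budget_ms}ms -> {tier.name}")
```

In this sketch a request with a long history and a generous budget lands on the largest tier, while the same request under a tight budget falls to a smaller one. A production system would presumably base the decision on learned signals rather than fixed thresholds.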
Why it matters
The surprise: the hardest part of AI isn't training anymore. It's serving predictions fast enough that users never notice the math happening behind their feed.