PrismML's 1-Bit Models Fit an 8B AI in 1.15 GB

April 2, 2026


Published: April 2, 2026 at 12:29 AM


What happened

PrismML claims to have built the first commercially viable 1-bit AI models, compressing an 8-billion-parameter model into just 1.15 GB, down from roughly 16 GB for the same weights at standard 16-bit precision. The company released three open-source Bonsai models under the Apache 2.0 license that it says run at 130+ tokens per second on an iPhone 17 Pro Max. Every component uses 1-bit weights, not just selected layers, which the company says cuts energy use and memory by orders of magnitude versus standard models. The catch: PrismML is the only source for these performance claims, and fully 1-bit quantization remains largely unproven territory.
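The memory arithmetic behind the headline number is easy to check, and a minimal sketch shows what "1-bit weights" means in practice. The binarization below follows the common sign-plus-scale approach used in BitNet-style work; PrismML has not published its exact scheme, so this is an illustration, not its method. The gap between the raw 1.0 GB of packed weight bits and the published 1.15 GB presumably covers embeddings, quantization scales, and file overhead.

```python
import numpy as np

# Back-of-envelope memory math for an 8B-parameter model.
params = 8e9
fp16_gb = params * 2 / 1e9      # FP16: 2 bytes per weight
one_bit_gb = params / 8 / 1e9   # 1 bit per weight, packed 8 per byte
print(f"FP16: {fp16_gb:.0f} GB, 1-bit: {one_bit_gb:.2f} GB")
# prints: FP16: 16 GB, 1-bit: 1.00 GB

# Sign-plus-scale binarization of one weight matrix (illustrative only):
# each weight keeps only its sign, plus one shared per-tensor scale.
def binarize(W: np.ndarray):
    scale = np.abs(W).mean()    # per-tensor scaling factor
    return np.sign(W), scale    # weights collapse to {-1, +1}

def dequantize(B: np.ndarray, scale: float):
    return B * scale            # coarse reconstruction of W

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)).astype(np.float32)
B, s = binarize(W)
W_hat = dequantize(B, s)
```

At inference time, multiplying by a {-1, +1} matrix reduces to additions and subtractions plus one scale multiply, which is where the claimed energy savings would come from.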

Why it matters

If the benchmarks hold up, your smartwatch could soon run the kind of AI that today requires a server.
