Krux

PrismML's 1-Bit Models Fit an 8B AI in 1.15 GB
Published: April 2, 2026 at 12:29 AM
Updated: April 2, 2026 at 12:29 AM
100-word summary
PrismML claims to have built the first commercially viable 1-bit AI models, compressing an 8-billion-parameter model into just 1.15 GB. That's less than a tenth of the roughly 16 GB the same model needs at standard 16-bit precision. The company released three open-source Bonsai models under Apache 2.0 that run at 130+ tokens per second on an iPhone 17 Pro Max. Every component uses 1-bit weights, not just some layers, which drops energy use and memory by orders of magnitude versus standard models. The catch: PrismML is the only source for these performance claims, and 1-bit quantization is unproven territory. If the benchmarks hold up, your smartwatch could soon run the kind of AI that today requires a server.
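The headline number follows almost directly from the bit widths. A back-of-envelope sketch (the gap between the raw 1-bit figure and the reported 1.15 GB is presumably embeddings, scaling factors, and file overhead — an assumption, since PrismML hasn't published a breakdown):

```python
# Back-of-envelope weight storage for an 8B-parameter model
# at different precisions, in decimal gigabytes (the unit
# marketing figures typically use).

PARAMS = 8_000_000_000
GB = 1e9

def weights_gb(bits_per_weight: float) -> float:
    """Raw weight storage in GB at the given precision."""
    return PARAMS * bits_per_weight / 8 / GB

fp16 = weights_gb(16)    # standard 16-bit model
one_bit = weights_gb(1)  # fully 1-bit weights

print(f"FP16:  {fp16:.2f} GB")   # 16.00 GB
print(f"1-bit: {one_bit:.2f} GB")  # 1.00 GB
print(f"Compression: {fp16 / one_bit:.0f}x")  # 16x
```

So 8 billion 1-bit weights take about 1.0 GB on their own, leaving roughly 0.15 GB of the claimed 1.15 GB for everything else in the file.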