Google's Gemma 4 Runs AI Models Offline on Your Phone

April 5, 2026

Published: April 5, 2026 at 12:25 AM

Updated: April 5, 2026 at 12:25 AM

What happened

Google has released Gemma 4, a family of open-weight AI models designed to run directly on devices without an internet connection. The smallest version has just 2 billion parameters but supports a 128,000-token context window, enough to process roughly 100 pages of text locally on a phone or laptop. The lineup includes models that understand audio and video, not just text. All four variants ship under the Apache 2.0 license, letting companies run them on-premises without licensing headaches.

Why it matters

You can generate code in your IDE, analyze video footage on a security camera, or build chatbots that work in airplane mode, all starting today on hardware ranging from Android phones to consumer GPUs.

Sources