Krux

April 5, 2026
Google's Gemma 4 Runs AI Models Offline on Your Phone
Published: April 5, 2026 at 12:25 AM
Updated: April 5, 2026 at 12:25 AM
What happened
Google has released Gemma 4, a family of open-weight AI models designed to run directly on devices without an internet connection. The smallest version has only 2 billion parameters yet supports a 128,000-token context, so it can process roughly 100 pages of text locally on a phone or laptop. The lineup includes models that understand audio and video, not just text. All four variants ship under the Apache 2.0 license, which lets companies run them on-premises without licensing headaches.
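As a back-of-the-envelope check on why a 2-billion-parameter model can plausibly fit on a phone, the sketch below estimates the memory needed just to store the weights at a few common numeric precisions. The precisions and the helper name `weight_memory_gb` are illustrative assumptions, not details from Google's announcement, and the estimate ignores activations, the context cache, and runtime overhead.

```python
# Rough estimate of the memory needed to hold model weights only.
PARAMS = 2_000_000_000  # smallest Gemma 4 variant, per the article

# Assumed precisions for illustration; the article does not say
# which formats the models are actually distributed in.
BITS_PER_PARAM = {"float16": 16, "int8": 8, "int4": 4}


def weight_memory_gb(params: int, bits: int) -> float:
    """Return gigabytes (10^9 bytes) needed to store `params` weights."""
    return params * bits / 8 / 1e9


for name, bits in BITS_PER_PARAM.items():
    print(f"{name}: {weight_memory_gb(PARAMS, bits):.1f} GB")
```

Under these assumptions the weights need about 4 GB at 16 bits per parameter and about 1 GB at 4 bits, which is within reach of the memory on recent phones and laptops.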
Why it matters
You can generate code in your IDE, analyze video footage on a security camera, or build chatbots that work in airplane mode, all starting today on hardware ranging from Android phones to consumer GPUs.