Google's Gemma 4 Brings 256K Context to Your Phone

April 3, 2026

Published: April 3, 2026 at 12:38 AM

Updated: April 3, 2026 at 12:38 AM

What happened

Google just released Gemma 4, a family of open models designed to run entirely on your device. The twist? Even the smallest phone-optimized variants pack 128K-token context windows, while cloud versions stretch to 256K. That's enough to feed an entire codebase or research paper into a model running offline on Android. The Apache 2.0 license means anyone can download, customize, and ship these models without licensing fees or vendor approval. The four variants range from 2B-parameter models built for edge devices to 31B dense models, all with native function calling. Google is betting that developers want sophisticated AI that doesn't phone home.

Why it matters

Your calendar app could reason about your week without sending data to the cloud.