Google's Gemini 3.1 Pro Doubles Its Reasoning Score

Published: March 1, 2026 at 2:12 PM

Updated: March 1, 2026 at 2:12 PM

100-word summary

Google released Gemini 3.1 Pro on February 19, claiming it more than doubled its predecessor's reasoning benchmark (77.1% on ARC-AGI-2 versus the previous generation). The upgrade brings a 1-million-token context window and costs around $2 per million input tokens. Translation: the model can now digest an entire codebase or book in one go and chain together multi-step tasks with fewer hallucinations. It's rolling out across Google's developer tools, Vertex AI for enterprises, and consumer apps like NotebookLM. The catch? Google emphasizes it still needs human oversight, a reminder that smarter models don't mean foolproof ones.

What happened

Why it matters

Google emphasizes it still needs human oversight, a reminder that smarter models don't mean foolproof ones.

Sources

Google Yahoo Finance The Verge Ars Technica