Google's Gemini 3.1 Flash-Lite Is 2.5x Faster, Half the Price

Published: March 9, 2026 at 12:29 AM

Updated: March 9, 2026 at 12:29 AM

100-word summary

Google just released Gemini 3.1 Flash-Lite, its speediest model yet. It starts answering 2.5x faster than the previous version and generates output 45% faster, all while costing less: $0.25 per million input tokens, below what older models charge. The real trick? You can now dial how hard the model "thinks" on each task. Set it low for content moderation that needs to run thousands of times per second; crank it up when generating dashboards or simulations. The model is in preview, so real-world mileage may vary. Speed used to mean stupidity in AI. Not anymore.
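To put the quoted price in concrete terms, here is a rough sketch of the input-side arithmetic, plus a toy lookup for choosing a thinking level per task. Only the $0.25 per million input tokens figure comes from the summary above. The workload numbers, the function names, and the "low"/"high" labels are illustrative assumptions, not confirmed API values, and output-token pricing is left out because the summary does not state it.

```python
# Input price quoted in the summary (USD per million input tokens).
INPUT_PRICE_PER_MILLION_USD = 0.25

# Hypothetical mapping of task type to thinking level. The level names are
# placeholders for illustration, not confirmed API values.
THINKING_LEVEL_BY_TASK = {
    "content_moderation": "low",
    "dashboard_generation": "high",
    "simulation": "high",
}


def input_cost_usd(input_tokens: int) -> float:
    """Estimate input-side cost only; output pricing is not covered here."""
    return input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION_USD


def pick_thinking_level(task: str) -> str:
    """Return the level for a task, defaulting to "low" (an assumption)."""
    return THINKING_LEVEL_BY_TASK.get(task, "low")


if __name__ == "__main__":
    # Hypothetical workload: 1,000 moderation requests per second,
    # 200 input tokens each, sustained for one hour.
    tokens = 1_000 * 200 * 3_600
    print(f"{tokens:,} input tokens -> ${input_cost_usd(tokens):.2f}")
    print("moderation level:", pick_thinking_level("content_moderation"))
```

Under those assumed numbers, an hour of moderation traffic is 720 million input tokens, or about $180 on the input side.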
