Google’s Gemini 3.1 Pro is mostly great
The New Stack
by Frederic Lardinois | February 19, 2026
AI-Generated Deep Dive Summary
Google has launched Gemini 3.1 Pro, an updated version of its AI model that shows significant improvements in solving complex problems compared to its predecessors. While not the best at every task, the new model excels in reasoning benchmarks, performing notably better than previous generations. For instance, it scores 77.1% on the ARC-AGI-2 benchmark, a substantial increase from Gemini Pro's earlier 31.1%. This performance surpasses competitors like Anthropic's Opus 4.6 (68.8%) and OpenAI's GPT-5.2 (52.9%). However, Gemini 3.1 Pro falls short in the GDPval-AA benchmark, scoring 1317 points compared to Anthropic's Sonnet 4.6 at 1633 points.
In coding benchmarks, Gemini 3.1 Pro leads the competition in most tests, including Terminal-Bench 2.0 for agentic coding. While OpenAI's Codex model reports higher scores using its own framework, Gemini 3.1 Pro still holds its ground in other metrics. The model also features a large context window of 1 million tokens and can process various media types like text, photos, videos, and audio, though its output is limited to 64,000 tokens.
Pricing remains unchanged at $2/$12 per million input/output tokens, making Gemini 3.1 Pro more affordable than Anthropic's Opus 4.6, which costs $5/$25 per million tokens. The model is now widely available through Google AI Studio, Vertex AI, and other platforms, offering developers and enterprises access to a powerful tool for various applications.
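To make the pricing difference concrete, here is a small illustrative cost estimator using the per-million-token rates quoted above. The model names, the `RATES` table, and the `estimate_cost` helper are assumptions for this sketch, not an official API; only the dollar rates and token limits come from the article.

```python
# Illustrative per-request cost estimate from published per-million-token rates.
# Rates: (input USD per 1M tokens, output USD per 1M tokens), per the article.
RATES = {
    "gemini-3.1-pro": (2.00, 12.00),
    "claude-opus-4.6": (5.00, 25.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A worst-case request: the full 1M-token context window plus the 64K output cap.
gemini = estimate_cost("gemini-3.1-pro", 1_000_000, 64_000)   # -> 2.768
opus = estimate_cost("claude-opus-4.6", 1_000_000, 64_000)    # -> 6.60
print(f"Gemini 3.1 Pro: ${gemini:.2f}")
print(f"Opus 4.6:       ${opus:.2f}")
```

Even at maximum context usage, a single Gemini 3.1 Pro request comes in at under half the cost of the equivalent Opus 4.6 call.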
For those in DevOps and cloud development, Gemini 3.1 Pro's enhanced reasoning and coding capabilities make it a valuable asset for automating complex tasks. Its availability across multiple platforms also lowers the barrier to integrating it into existing workflows.