On Tuesday, Google announced Gemini 2.5, a new family of AI reasoning models that pause to "think" before answering a question.
To kick off the new family of models, Google is launching Gemini 2.5 Pro Experimental, a multimodal reasoning AI model that the company claims is its most intelligent yet. The model will be available on Tuesday on the company's developer platform, Google AI Studio, as well as in the Gemini app for subscribers to the company's $20-a-month AI plan, Gemini Advanced.
Moving forward, Google says that all of its new AI models will have reasoning capabilities baked in.
Since OpenAI launched the first AI reasoning model, o1, in September 2024, the tech industry has raced to match or exceed that model's capabilities with models of their own. Today, Anthropic, DeepSeek, Google, and xAI all have reasoning models, which use extra computing power and time to fact-check and reason through problems before delivering an answer.
Reasoning techniques have helped AI models reach new heights in math and coding tasks. Many in the tech world believe reasoning models will be a key component of AI agents, autonomous systems that can perform tasks largely without human intervention. However, these models are also more expensive to run.
Google has experimented with AI reasoning models before, releasing a "thinking" version of Gemini in December. But Gemini 2.5 represents the company's most serious attempt yet at besting OpenAI's "o" series of models.
Google claims that Gemini 2.5 Pro outperforms its previous frontier AI models, and some of the leading competing AI models, on several benchmarks. In particular, Google says it designed Gemini 2.5 to excel at creating visually compelling web apps and agentic coding applications.
On one evaluation measuring code editing, called Aider Polyglot, Google says Gemini 2.5 Pro scores 68.6%, outperforming top AI models from OpenAI, Anthropic, and Chinese AI lab DeepSeek.
However, on another test measuring software development capabilities, SWE-bench Verified, Gemini 2.5 Pro scores 63.8%, beating OpenAI's o3-mini and DeepSeek's R1, but falling short of Anthropic's Claude 3.7 Sonnet, which scored 70.3%.
On Humanity's Last Exam, a multimodal test consisting of thousands of crowdsourced questions relating to math, the humanities, and the natural sciences, Google says Gemini 2.5 Pro scores 18.8%, performing better than most rival flagship models.
To start, Google says Gemini 2.5 Pro ships with a 1 million token context window, meaning the AI model can take in roughly 750,000 words in a single go. That's longer than the entire "Lord of the Rings" book series. And soon, Gemini 2.5 Pro will support double that input length (2 million tokens).
Google did not publish API pricing for Gemini 2.5 Pro. The company said it would share more in the coming weeks.