Google launches Gemini 2.5 Deep Think - surpassing OpenAI o3 and Grok 4 in performance

Today, Google officially rolled out the Gemini 2.5 Deep Think model in the Gemini app for Google AI Ultra subscribers. Google says the new version outperforms both OpenAI o3 and xAI Grok 4 on several key benchmarks.

From I/O 2025 to the final version

At Google I/O 2025 in May, Google first introduced a Deep Think mode for Gemini 2.5 Pro, which uses new research techniques to consider multiple hypotheses before arriving at an answer. Today's release is an upgraded version, with improvements based on feedback from internal testers and recent research breakthroughs. Google claims it is a big step forward from the earlier demo version.

Google revealed that Deep Think is a variant of the model that won a gold medal at the International Mathematical Olympiad (IMO) 2025. However, to suit everyday use, Google has optimized this version for speed, which means its performance on the IMO 2025 benchmark drops to bronze-medal level.

According to Google's benchmark table (shown in the image), Gemini 2.5 Deep Think leads on several tests:

LiveCodeBench V6
Humanity's Last Exam
IMO 2025
AIME 2025

[Image: benchmark table for Gemini 2.5 Deep Think]

Google AI Ultra users can now enable Deep Think directly in the Gemini app by selecting the option that appears in the prompt bar, with a limited number of uses per day. Google says this mode automatically works with tools such as code execution and Google Search.

In the coming weeks, Google plans to make Deep Think available through the Gemini API to a group of trusted developers, in versions both with and without tool support.

The big question now is: Will OpenAI's upcoming GPT-5 dethrone Gemini 2.5 Deep Think? The competition between tech giants has never been more exciting!