Google Launches Gemini 2.5 Computer Use: AI Models That Superiorly Control Web and Mobile Interfaces
At Google I/O earlier this year, Google revealed plans to add 'computer use' capabilities to the Gemini API. Now, the company has officially announced Gemini 2.5 Computer Use, a specialized model designed to help AI agents interact directly with user interfaces (UI). According to Google, this new model has outperformed many leading competitors in AI performance assessments related to web and mobile app controls.
How the 'computer_use' tool works in the Gemini API
The process of using this tool includes the following steps:
- The developer sends a user request to the tool, along with a screenshot of the working environment and a history of recent operations.
- Additionally, developers can specify which types of UI actions are excluded or add custom functions if needed.
- The Gemini 2.5 Computer Use model will analyze input data and generate a corresponding response — which could be a UI action like a mouse click or text entry.
- If the model is uncertain, it will require confirmation from the end user, especially in sensitive actions like purchases.
- The client-side code then performs the specified action (like clicking a button or displaying a confirmation box).
- Once complete, the system sends back the new screenshot and current URL to the model, restarting the loop.
This cycle will repeat until the main goal requested by the user is completed.
According to Google, Gemini 2.5 Computer Use is optimized for web browsers, but also shows high performance when controlling mobile application interfaces. However, this model is not optimized for controlling interfaces at the desktop OS level.
Internal testing results show that Gemini 2.5 Computer Use achieves leading performance on many standard AI tests in the field of UI control.
Gemini 2.5 Computer Use is now in public preview. Developers can access and test the model via the Gemini API on Google AI Studio and Vertex AI.
You should read it
- Gemini 2.0 Flash Launched: Google's New Breakthrough in AI
- What is Google Gemini? How does Gemini work?
- Google is about to release Gemini AI for kids under 13
- How to use Gemini 1.5 Flash for free
- Google Launches Gemini CLI, Bringing Gemini to the Terminal
- 10 best photo creation prompts using Google Gemini's Nano Banana AI