Google has officially kicked off the Gemini 2.0 era with the launch of the all-new Gemini 2.0 Flash. The company claims that Gemini 2.0 Flash outperforms Gemini 1.5 Pro on key benchmarks while running up to twice as fast as its predecessor.
As such, Gemini 2.0 Flash will become Google's flagship AI model, competing directly with offerings from OpenAI and other big names in the market. In addition to improved performance and low latency, Gemini 2.0 Flash comes with native support for multimodal output, including natively generated images mixed with text and steerable multilingual text-to-speech (TTS) audio. The model also accepts multimodal inputs such as images, video, and audio, and is tightly integrated with native tools, including Google Search, code execution, and more.
In simple terms, Gemini 2.0 Flash stands out for its ability to process multiple types of input (text, images, video, audio) and produce diverse outputs, including images and speech. The previous generation, 1.5 Flash, could only generate text and was not suited to more demanding tasks. With 2.0 Flash, Google claims the model is not only fast but also highly flexible, thanks to its ability to call tools such as Google Search and connect to external APIs.
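To make the tool-calling idea concrete, here is a minimal sketch of how a developer might ask Gemini 2.0 Flash a question with the built-in Google Search tool enabled, using Google's google-genai Python SDK. The model ID gemini-2.0-flash-exp, the placeholder API key, and the example prompt are assumptions for illustration, not details from Google's announcement.

```python
# Hypothetical sketch: querying Gemini 2.0 Flash with the native Google Search
# tool enabled, via the google-genai Python SDK.
# The model ID "gemini-2.0-flash-exp" and the API key are assumed placeholders.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="Summarize today's top technology headlines.",
    config=types.GenerateContentConfig(
        # Turn on the built-in Google Search tool so the model can ground its answer.
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)

print(response.text)
```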
Developers can try the experimental release of Gemini 2.0 Flash in Google AI Studio and Vertex AI today. Additionally, Google is rolling out a free beta of the new Multimodal Live API, which supports real-time audio, streaming video input, and combined use of multiple tools.
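As a second illustration, the sketch below shows what a simple multimodal request might look like for a developer trying the model out: sending an image alongside a text prompt and reading back the text response. Again, the google-genai SDK usage, the model ID gemini-2.0-flash-exp, and the file name chart.png are assumptions for illustration only.

```python
# Hypothetical sketch: a multimodal request that pairs an image with a text prompt.
# The model ID, API key, and input file are assumed placeholders.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Load a local image to send alongside the text prompt.
with open("chart.png", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "Describe what this chart shows in two sentences.",
    ],
)

print(response.text)
```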
The new Gemini 2.0 Flash model will be available to users through the Gemini experience on desktop and the web, with mobile apps to follow soon. Google plans to announce general availability of Gemini 2.0 Flash in January 2025.
Along with Gemini 2.0 Flash, Google also announced a number of prototypes exploring the agentic capabilities of Gemini 2.0.
With its multimodal capabilities and native tool integration, Gemini 2.0 Flash opens up exciting possibilities for both developers and users.