Google pivots to agentic AI with Gemini 3.5 Flash and unified Omni model
Google’s latest release prioritises speed and cost-efficiency for coding and agent workflows, introducing Gemini 3.5 Flash, a 24/7 Spark agent for subscribers, and an Omni model designed to replace its Veo video generator.

Google has launched Gemini 3.5 Flash, a new model optimised for agentic artificial intelligence tasks, marking a strategic pivot towards autonomous agents capable of complex, multi-step work. Released during the company’s annual I/O developer conference, the model delivers output speeds of nearly 300 tokens per second, significantly faster than larger frontier models such as Gemini 3.1 Pro, while maintaining comparable benchmark scores. This efficiency is central to Google’s strategy to make complex agentic experiences viable at scale by addressing the high cost of running generative AI for extended periods.
The model has been rolled out across Google’s ecosystem, including the Gemini app, API, AI Studio, Android Studio, and enterprise products. Internal metrics cited by Tulsee Doshi, senior director of product management for Gemini, indicate a substantial improvement in coding performance for Googlers using 3.5 Flash compared to previous iterations. The model shows measurable gains on Terminal Bench and SWE-Bench Pro tests, outperforming older Flash versions and matching the performance of OpenAI’s GPT 5.5 on coding and general computing tasks within the OSWorld-Verified benchmark.
To support these agentic capabilities, Google is upgrading its Antigravity IDE to version 2.0, which adds support for parallel agentic workflows. The update allows the system to spawn multiple sub-agents simultaneously, something Google attributes to the new model’s token output efficiency. Doshi noted that insights from developer usage, particularly within Antigravity, have been critical in refining the model’s code generation and tool-use performance, with the aim of reducing the friction models face when interacting with user interfaces designed for humans.
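For readers unfamiliar with the pattern, a parallel agentic workflow can be pictured as an orchestrator that hands tasks to several sub-agents at once and then gathers their results. The Python sketch below is purely illustrative and does not reflect Antigravity’s actual implementation or any Google API: the function names `run_sub_agent` and `orchestrate` are hypothetical, and a short sleep stands in for a real model call.

```python
import asyncio


async def run_sub_agent(name: str, task: str) -> str:
    """Hypothetical stand-in for one sub-agent; a real one would call a model."""
    await asyncio.sleep(0.01)  # simulated model latency, not a real API call
    return f"{name} finished: {task}"


async def orchestrate(tasks: list[str]) -> list[str]:
    """Fan tasks out to sub-agents concurrently and collect results in input order."""
    jobs = [
        run_sub_agent(f"agent-{i}", task)
        for i, task in enumerate(tasks, start=1)
    ]
    # asyncio.gather runs the coroutines concurrently and preserves their order.
    return await asyncio.gather(*jobs)


def run_demo() -> list[str]:
    """Run three illustrative tasks through the orchestrator."""
    return asyncio.run(
        orchestrate(["write tests", "refactor module", "update docs"])
    )


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

Because the sub-agents run concurrently, total wall-clock time is close to that of the slowest single task rather than the sum of all of them, which is why faster token output per agent makes this kind of fan-out more practical.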
In a move to bring agentic capabilities to consumers, Google introduced Gemini Spark, a dedicated agent available to AI Ultra subscribers starting next week. Operating continuously in the cloud, Spark functions independently of specific devices or browser tabs, drawing context from a user’s Google ecosystem to execute tasks such as email monitoring, meeting summarisation, and data analysis. The service is part of a revised subscription structure that adds a new Ultra tier priced at $100 per month, below the existing $200 tier, which retains higher token limits.
Alongside Gemini 3.5 Flash, Google unveiled Omni Flash, a new multimodal model designed to accept text, images, video, and audio as input and to produce outputs in corresponding formats. For now its output is limited to video generation, and Omni Flash is replacing the Veo model in the Gemini app, Google Flow, and YouTube Shorts. While Omni represents a step toward a unified multimodal future, Google acknowledges that the technology is still in its early stages and plans to evaluate whether specific use cases might benefit from custom models rather than a single all-encompassing system.


