LLMs

Google launches Gemini 3.5 models and Omni multimodal family at I/O 2026

Google unveiled Gemini 3.5 Flash as the new default model, introduced Gemini Omni for text-to-video generation, and previewed always-on agents powered by Gemini Spark.

Last verified:

Gemini Omni: Multimodal video from mixed inputs

Google’s most significant announcement at I/O 2026 is the introduction of Gemini Omni, a new model family that generates video from multiple input modalities simultaneously. According to The Verge AI, Omni Flash—the first release in this family—is now available in the Gemini app, Google Flow, and YouTube Shorts.

Unlike Google’s prior Veo system, which accepts only text-to-video prompts, Omni Flash accepts text, photographs, video clips, and audio as input to produce video output. This architectural flexibility directly addresses a gap in Google’s competitive positioning: OpenAI’s Canvas and Anthropic’s Claude have demonstrated demand for agents that can ingest and remix multimodal source material. The Verge reports that Google envisions future Omni versions capable of “create[ing] anything from any input,” signaling ambitions to build a general-purpose content-synthesis engine rather than a constrained text-to-media pipeline.

The rollout cadence—immediate availability in consumer products rather than a slow preview—suggests Google’s confidence in the model’s quality and its urgency to capture the video-generation market before competitors consolidate it.

Gemini 3.5 Flash replaces base model; Pro follows in June

Effective May 19, Gemini 3.5 Flash is now the default model powering the Gemini app and AI Mode in Google Search. According to The Verge, this iteration excels at agent-like operations requiring planning and iterative problem-solving, a capability that will drive adoption among teams using Search-integrated reasoning workflows.

The model also includes stricter content-safety mechanisms. The Verge reports that Gemini 3.5 Flash reduces false-positive safety blocks—queries flagged as unsafe when they are in fact benign—while lowering the model’s propensity to generate harmful content. This represents a tuning shift toward reducing user friction without sacrificing safety posture, a trade-off that benefits high-volume consumer applications like Search.

Gemini 3.5 Pro, the higher-capacity variant, launches in June 2026. Google has not disclosed Pro’s parameter count, context window, or benchmark scores relative to Flash, leaving the performance gap to be established upon release. The staggered launch—Flash now, Pro deferred—prioritizes rapid deployment of the efficient model over the powerful one, inverting the typical “launch flagship first” strategy and indicating Google’s confidence that Flash-tier performance will satisfy the majority of deployed use cases.

Gemini Spark: 24/7 agent for workspace automation

The Verge reports that Google is introducing Gemini Spark, an always-on agent that runs continuously on Google Cloud virtual machines, powered by Gemini 3.5 Flash. Spark integrates with Google Workspace applications (Docs, Gmail, Sheets, Slides) and third-party platforms including Canva and Instacart, enabling tasks such as email composition, study-guide generation, and automated expense anomaly detection (flagging unexpected credit card charges).

This represents Google’s counter to OpenAI’s Canvas—a persistent, context-aware agent that operates across multiple SaaS boundaries. Spark’s always-on architecture is particularly notable: the agent monitors user activity and Workspace events asynchronously, rather than responding only to explicit prompts. The Verge notes that Google plans to expand Spark’s capabilities to access local files stored on macOS, a capability that will extend Spark’s context window to include user documents outside the cloud.

Android app generation in AI Studio with live emulation

Google is enabling developers to generate native Android applications using natural-language prompts in AI Studio, with embedded Android emulation for preview and edit cycles. According to The Verge, generated apps can be exported to Android Studio, GitHub, or downloaded as ZIP archives. Publishing directly to the Google Play Store is supported; Google also plans to allow private publication for friends and family, with Firebase integration coming later.

This tooling reduces the activation energy for app prototyping and democratizes development for non-professional builders, a segment OpenAI is targeting with its own code-generation and low-code offerings.

Gemini app redesign with haptic feedback

The Verge reports that Google is rolling out a visual redesign for the Gemini app and web interface, effective May 19 across Android, iOS, and web. The redesign—called “neural expressive” by Google—introduces animated transitions, color accents, typography updates, and haptic feedback on mobile devices. While cosmetic, the redesign signals investment in user retention and perceived model responsiveness, differentiating Gemini’s tactile experience from competitors like ChatGPT.

Why This Matters

For enterprise productivity teams, Spark’s always-on integration with Workspace and third-party SaaS reduces the friction of adopting AI automation—agents no longer require explicit invocation or context-window engineering. Teams evaluating vendor lock-in between Google Workspace and Microsoft 365 now have a concrete agent-enabled capability exclusive to Google’s stack.

For video-generation feature teams, Omni’s multimodal input architecture signals that single-modality models (text-only input, video-only output) are becoming insufficient. Competing vendors—including OpenAI, Runway, and startups—will need to demonstrate similar input flexibility or risk losing market share in the creator and marketing segments.

For Google Search’s competitive position, Flash’s improved agentic reasoning and reduced false-positive safety blocks will drive A/B testing improvements in search-result ranking. If independent benchmarks confirm Flash’s agent-handling gains, enterprise search tools and RAG pipelines will face pressure to adopt Gemini in place of GPT-4 or Claude Opus 4, a shift that would increase Google Cloud consumption.

The June release of Gemini 3.5 Pro will be the first test of whether Google’s staggered-release strategy succeeded—if Pro demonstrates only marginal gains over Flash, the efficient model may cannibalize demand for the premium tier.

Frequently Asked Questions

What is Gemini 3.5 Flash and when is it available?

Gemini 3.5 Flash is Google's updated base model released on May 19, 2026. It is now the default model in the Gemini app and Google's AI Mode in Search. Gemini 3.5 Pro will follow in June 2026.

How does Gemini Omni differ from Veo?

Omni Flash accepts multiple input types—text, images, video, and audio—to generate video, whereas Veo (Google's prior system) only accepts text prompts. Google plans for future Omni versions to 'create anything from any input.'

What is Gemini Spark and how does it work?

Spark is an always-on agent powered by Gemini 3.5 Flash that runs 24/7 on Google Cloud virtual machines. It integrates with Workspace apps (Docs, Gmail, Sheets, Slides) and third-party services (Canva, Instacart) to automate tasks like email drafting and expense monitoring.

Can I publish apps made with Google's AI Studio?

Yes. Users can publish vibe-coded Android apps directly to the Play Store from AI Studio. Google also plans to offer private publication for friends and family, and will add Firebase integration support later.

#gemini #google-io-2026 #multimodal #video-generation #agents