Google Launches Nano Banana 2 Lite, Image Generation in Just 4 Seconds

2026-07-01 13:47


en.Wedoany.com Reported - Google recently released two new models for developers: Gemini Omni Flash and Nano Banana 2 Lite. The former integrates Gemini's multimodal reasoning into video generation and editing, while the latter focuses on high-speed image generation.

Gemini Omni Flash was unveiled at Google I/O 2026. Its core capability is bringing Gemini's multimodal reasoning into video generation and editing workflows, and it is now available via the Gemini API and Google AI Studio. Google highlights four capabilities: conversational video editing, which lets users refine videos in natural language; multimodal referencing, which combines image, text, and video inputs to keep scenes consistent; drawing on Gemini's knowledge of areas such as history, biology, and narrative logic when constructing videos; and synchronizing text with video actions through simple prompts. Omni Flash is priced at $0.10 per second of video output, on par with Veo 3.1 Fast.

Google also listed the model's current limitations: it supports only 10-second video generation; it does not support audio reference uploads or scene extension; the API accepts reference videos of up to 3 seconds, but the model cannot yet process such inputs correctly; and character consistency across scene transitions and camera movements remains limited.
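The per-second price and the 10-second limit together put a rough ceiling on what a single clip costs. The short Python sketch below works through that arithmetic. It assumes the 10-second figure is an upper bound on clip length and that billing is strictly linear in output seconds; neither is confirmed by the article, and `clip_cost_usd` is a hypothetical helper, not part of any Google SDK.

```python
# Figures quoted in the article; treat them as illustrative, not a price list.
OMNI_FLASH_USD_PER_SECOND = 0.10
MAX_CLIP_SECONDS = 10  # assumption: the 10-second limit is an upper bound


def clip_cost_usd(seconds: float) -> float:
    """Estimate the output cost of one clip, assuming linear per-second billing."""
    if not 0 < seconds <= MAX_CLIP_SECONDS:
        raise ValueError(f"clip length must be in (0, {MAX_CLIP_SECONDS}] seconds")
    return round(seconds * OMNI_FLASH_USD_PER_SECOND, 2)


print(clip_cost_usd(10))  # cost of a full-length clip in USD
```

Under these assumptions a full 10-second clip comes to about one dollar of video output.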

The other model, Nano Banana 2 Lite, named gemini-3.1-flash-lite-image, is optimized for latency-sensitive scenarios. Its core selling point is an image generation latency of approximately 4 seconds, one-fifth that of Nano Banana 2. Generating a 1K-resolution image costs about $0.034, half the cost of Nano Banana 2 and one-quarter that of Nano Banana Pro. In text rendering and benchmark tests, Nano Banana 2 Lite is on par with models like Grok, which makes it suitable for scenarios such as batch generation of e-commerce materials and iterative ad creative development.
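The ratios in this paragraph imply approximate figures for the other two models, and they make batch costs easy to estimate. The Python sketch below derives them. The derived values are back-of-the-envelope numbers computed from the article's "about" figures, not official prices, and `batch_cost_usd` is a hypothetical helper.

```python
# Figures quoted in the article for Nano Banana 2 Lite (both approximate).
LITE_LATENCY_S = 4.0
LITE_COST_USD_PER_1K_IMAGE = 0.034

# Values implied by the stated ratios (derived here, not quoted prices).
implied = {
    "Nano Banana 2 latency (s)": round(LITE_LATENCY_S * 5, 1),
    "Nano Banana 2 cost per 1K image (USD)": round(LITE_COST_USD_PER_1K_IMAGE * 2, 3),
    "Nano Banana Pro cost per 1K image (USD)": round(LITE_COST_USD_PER_1K_IMAGE * 4, 3),
}


def batch_cost_usd(num_images: int) -> float:
    """Estimated cost of a batch of 1K images on Nano Banana 2 Lite."""
    return round(num_images * LITE_COST_USD_PER_1K_IMAGE, 2)


for name, value in implied.items():
    print(f"{name}: {value}")
print("1,000-image batch (USD):", batch_cost_usd(1000))
```

Read this way, a 1,000-image e-commerce batch would cost roughly $34 on the Lite model, against roughly twice that on Nano Banana 2.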

Google demonstrated a workflow that chains the two models together: Nano Banana 2 Lite first generates images at high speed, and the generated images are then passed to Gemini Omni Flash as reference material to be turned into videos. To illustrate this, Google built three demo applications: Anywhere, which composites selfies or uploaded photos onto landmark locations and generates dynamic short clips; Space Lift, which generates different renovation plans from uploaded room photos and can convert them into spatial walkthrough videos; and Omni Product Studio, which generates contextual images and short ad videos for e-commerce products.
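The shape of that two-step workflow can be sketched in Python without committing to any particular SDK. In the sketch below, the image and video generators are passed in as plain callables; `image_then_video`, `Clip`, and the two stand-in backends are all hypothetical names invented for illustration, and a real integration would replace the stand-ins with calls to the Gemini API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Clip:
    """Minimal stand-in for a generated video (hypothetical type)."""
    reference_image: bytes
    prompt: str
    seconds: int


def image_then_video(
    image_prompt: str,
    video_prompt: str,
    generate_image: Callable[[str], bytes],
    generate_video: Callable[[bytes, str], Clip],
) -> Clip:
    """Chain the two steps: generate an image, then use it as the video reference."""
    image = generate_image(image_prompt)        # step 1: fast image generation
    return generate_video(image, video_prompt)  # step 2: image-referenced video


# Stand-in backends so the sketch runs offline; they do not call any real model.
def fake_image_backend(prompt: str) -> bytes:
    return f"image:{prompt}".encode()


def fake_video_backend(image: bytes, prompt: str) -> Clip:
    return Clip(reference_image=image, prompt=prompt, seconds=10)


clip = image_then_video(
    "a sofa in a sunlit loft",
    "slow pan across the room",
    fake_image_backend,
    fake_video_backend,
)
print(clip.seconds, clip.reference_image)
```

Injecting the two generators keeps the chaining logic independent of whichever client library or model version is eventually used.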

Reference link: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni-flash-nano-banana-2-lite/


