Google released Gemini 2.5 Flash Image (nicknamed nano-banana), its newest image generation and editing model. The system introduces several upgrades over earlier Flash models, including character consistency across prompts, multi-image fusion, precise prompt-based editing, and integration of world knowledge for semantic understanding.
The release is part of Google’s Gemini 2.5 family and pushes the Flash line of models further into native image generation. Gemini 2.0 Flash was recognized mainly for its speed and efficiency, and while it offered image generation, the results were limited in quality and editing precision. Gemini 2.5 Flash Image improves on both fronts, adding tools that make it more practical for quick experiments as well as structured creative workflows.
One technical focus of Gemini 2.5 Flash Image is character consistency, a common difficulty in generative models. It is designed to keep the same subject recognizable across multiple prompts or edits—for example, when moving a character between scenes, showing a product from different perspectives, or producing standardized visual assets.
The model also supports prompt-based image editing, where users describe specific changes in natural language. Typical operations include adjusting the background, removing or replacing objects, and modifying details such as a subject’s pose. In addition, a multi-image fusion capability allows features from several input images to be combined into a single result.
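As a rough illustration of this workflow, the sketch below requests a background edit through the google-genai Python SDK; the preview model identifier, prompt text, and file names are illustrative assumptions rather than values taken from the announcement.

```python
# Minimal sketch of prompt-based image editing via the google-genai Python SDK.
# The model name, prompt, and file paths below are illustrative assumptions.
from io import BytesIO

from google import genai
from PIL import Image

client = genai.Client()  # picks up the API key from the environment

source = Image.open("product-photo.png")

response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents=[
        "Remove the background and place the product on a plain white studio surface.",
        source,
    ],
)

# The response can interleave text and image parts; save any image returned.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        Image.open(BytesIO(part.inline_data.data)).save("edited-product.png")
```

Multi-image fusion follows the same shape: several input images can be passed in the contents list alongside a single instruction describing how they should be combined.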
Gemini 2.5 Flash Image also benefits from world knowledge integration, giving it an edge in scenarios that require semantic reasoning. Google has demonstrated examples such as reading and interpreting hand-drawn diagrams, adapting templates for real estate listings, and assisting with educational tasks that combine visual and textual understanding.
Industrial designer Thomas Broen shared his first impressions after testing the model:
I found it interesting how good it was at editing your own images. Like adding features, editing the background/foreground, etc. But also that it was able to ‘go back to the original image’ when it was asked. Something I find that ChatGPT sometimes struggles with.
The model builds on the low latency and efficiency of Gemini 2.0 Flash, while directly incorporating community feedback for higher-quality outputs and stronger editing control. It is available now in preview through the Gemini API, Google AI Studio, and Vertex AI, with a stable release expected in the coming weeks. To make experimentation easier, Google has also updated the build mode in Google AI Studio with new template applications.
Pricing has been confirmed at $30 per 1 million output tokens, with each image costing about $0.039. Other modalities follow Gemini 2.5 Flash pricing.
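According to Google, each generated image is billed as roughly 1,290 output tokens, which is where the per-image figure comes from: 1,290 × ($30 / 1,000,000) ≈ $0.039.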