Just weeks after releasing Gemini to the public as a replacement for its Bard system, Google has halted the image generation feature, saying "We got it wrong" and "We'll do better."
Gemini 1.0 and the much-anticipated Gemini Ultra, aka Gemini Advanced for some reason, were announced just a few weeks ago, boasting not only advanced multi-modal generation with text, image, and voice input, but also image generation akin to ChatGPT with GPT-4 and DALL·E 3. In fact, Google claimed the Ultra (Advanced, for some reason) model performed better than GPT-4.
The natural-language image generation, built right into the chat, was decent. While the results weren't on the level of DALL·E or Midjourney, they weren't bad. But there must have been a problem.
All of a sudden, image generation stopped working. When asked to create an image, Gemini would respond with:
"I’m still learning to create images so I can’t help you with that yet."
with no in-chat explanation of why.
But Google did have an explanation:
Three weeks ago, we launched a new image generation feature for the Gemini conversational app (formerly known as Bard), which included the ability to create images of people.
It’s clear that this feature missed the mark. Some of the images generated are inaccurate or even offensive. We’re grateful for users’ feedback and are sorry the feature didn't work well.
We’ve acknowledged the mistake and temporarily paused image generation of people in Gemini while we work on an improved version.
Apparently, images of people were not matching what users requested, in many cases showing only one ethnicity when none was specified, or producing historically inaccurate images.
This, according to Google, was due to the model being too cautious when creating images. Initially trained not to show sensitive or hurtful imagery, or images of real people such as celebrities, the model ended up being a bit too sensitive.
So what went wrong? In short, two things. First, our tuning to ensure that Gemini showed a range of people failed to account for cases that should clearly not show a range. And second, over time, the model became way more cautious than we intended and refused to answer certain prompts entirely — wrongly interpreting some very anodyne prompts as sensitive.
These two things led the model to overcompensate in some cases, and be over-conservative in others, leading to images that were embarrassing and wrong.
There has been no statement on when the feature will return, or whether the newly announced Gemini 1.5 suffers from the same problems.
Google's response mainly focused on images of people, but it seems that any request to create an image fails.
Read all about it here.
NOTE: As of March 4th, it looks like Google has reimplemented Image creation.