Google has opened access to Imagen 3, the latent diffusion model it announced earlier this year for generating images from text prompts.
As spotted by VentureBeat, Google recently published a research paper on Imagen 3 alongside the model’s launch in the US. In it, the company touts the methods it uses to minimize the potential harm of its AI image generation.
“We introduce Imagen 3, a latent diffusion model that generates high quality images from text prompts. We describe our quality and responsibility evaluations. Imagen 3 is preferred over other state-of-the-art (SOTA) models at the time of evaluation. In addition, we discuss issues around safety and representation, as well as methods we used to minimize the potential harm of our models.”
In practice, the focus on “safety and representation” means Google’s image generator declines some prompts. The release comes as Elon Musk’s xAI has launched image generation within Grok-2, which is almost completely unrestricted and has led to countless controversial images being created and shared across social media.
When it announced Imagen 3 at Google I/O in May, Google called the model its “highest quality” image generator to date, citing improved text rendering and fewer of the visual artifacts that are common in AI-generated images. Google also announced “Veo” at I/O, a generative AI video tool that has yet to launch publicly.