Google researchers have created an AI tool that can generate photorealistic images from text input. The researchers call their tool “Imagen” and report that people find its results more realistic than those of a comparable tool, OpenAI’s DALL-E 2.
Imagen generates images from a text description. Users can choose a style such as ‘oil painting’ or photorealistic. The latter is harder for artificial intelligence to do convincingly, and that is where Imagen excels, its makers say.
Imagen builds on a large pre-trained language model, such as GPT-3. According to the researchers, the best results come from keeping this model “frozen,” meaning its weights are not updated while the image generator is trained. The text input is first encoded by that model, and a diffusion model then turns random noise into an image that matches it.
Initially, Imagen creates a small image of 64 x 64 pixels. Super-resolution diffusion models then scale it up to a final result of 1024 x 1024 pixels. In this way the AI tool can create convincing images of things that do not exist, based on sentences like “Dragon fruit wearing a karate belt in the snow” and “Picture of a raccoon wearing an astronaut’s helmet looking out the window at night.”
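The flow described above (text encoding, a base image built from noise, then upscaling) can be sketched in a few lines of Python. This is a toy illustration, not Imagen's code: every function name here is hypothetical, the "language model" is a hash that maps the prompt to a single number, the "diffusion" loop just moves noise toward that number, and upscaling is plain pixel repetition. The split into two 4x stages is an assumption of the sketch; the article only gives the 64 x 64 start and the 1024 x 1024 result.

```python
import hashlib
import random

BASE_SIZE = 64       # first image size, per the article
FINAL_SIZE = 1024    # final image size, per the article
STAGE_FACTOR = 4     # assumption: two 4x stages (64 -> 256 -> 1024)


def toy_text_embedding(prompt):
    """Stand-in for the frozen language model: map text to one number in [0, 1)."""
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") / 2**32


def toy_base_diffusion(embedding, size=BASE_SIZE, steps=10, seed=0):
    """Start from Gaussian noise and step toward an image set by the embedding.

    A real diffusion model uses a trained network to predict and remove noise;
    here each step simply closes part of the gap to a flat target value.
    """
    rng = random.Random(seed)
    image = [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]
    for step in range(steps):
        blend = 1.0 / (steps - step)  # the last step uses blend = 1.0
        image = [[p + blend * (embedding - p) for p in row] for row in image]
    return image


def toy_upscale(image, factor=STAGE_FACTOR):
    """Nearest-neighbour upsampling, standing in for a super-resolution model."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


def generate(prompt):
    """Run the cascade and return the image produced at each stage."""
    stages = [toy_base_diffusion(toy_text_embedding(prompt))]
    while len(stages[-1]) < FINAL_SIZE:
        stages.append(toy_upscale(stages[-1]))
    return stages


if __name__ == "__main__":
    for image in generate("Dragon fruit wearing a karate belt in the snow"):
        print(len(image), "x", len(image[0]))
```

Run as a script, this prints the three stage sizes: 64 x 64, 256 x 256 and 1024 x 1024. The point of the cascade is that the expensive text-guided step only has to produce 4,096 pixels, while the later stages add resolution to an image whose content is already fixed.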
The researchers published a paper explaining how Imagen works. In it, they also compare their AI tool with other image-generating tools. According to the researchers, people prefer Imagen’s creations.
Imagen is not the first artificial intelligence tool that can create images based on text input. OpenAI previously released DALL-E 2. According to its makers, this tool can create realistic images and art based on text. DALL-E 2 can also make different versions of existing artwork.