Google has introduced a new technology called Imagen, an artificial intelligence (AI) model that creates highly realistic images from short text descriptions. Imagen can turn words or short descriptions such as “a little cactus in a straw hat and neon sunglasses in the Sahara Desert” or “a Pomeranian dog sitting on the king’s throne with a crown and two tiger soldiers” into pictures.
To do this, it relies on the Text-to-Text Transfer Transformer (T5), a model introduced in 2020 that was originally designed to take text strings as input and produce text strings as output. It has now been adapted for image synthesis.
Although the model initially produces images with a resolution of 64 x 64 pixels, the technology can scale them up first to 256 x 256 pixels and then to 1024 x 1024 pixels, using a cascade of diffusion models.
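The three-stage scaling described above can be pictured with a small sketch. It is only an illustration: the stage sizes (64, 256, 1024) come from the article, while the function names `upscale_nearest` and `run_cascade` are hypothetical, and simple pixel repetition stands in for the super-resolution diffusion models, which are not reproduced here.

```python
# Illustrative sketch only: the stage sizes come from the article; the
# nearest-neighbour upscaling is a stand-in for the real super-resolution
# diffusion models, which are not reproduced here.

CASCADE_SIZES = [64, 256, 1024]  # base image, then two upscaling stages


def upscale_nearest(image, factor):
    """Enlarge a square image (a list of rows) by repeating each pixel."""
    enlarged = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        enlarged.extend(list(wide_row) for _ in range(factor))
    return enlarged


def run_cascade(base_image, sizes=CASCADE_SIZES):
    """Pass an image through each stage and record the resolution reached."""
    if len(base_image) != sizes[0]:
        raise ValueError("base image must match the first stage size")
    image = base_image
    reached = [(len(image), len(image[0]))]
    for current, target in zip(sizes, sizes[1:]):
        image = upscale_nearest(image, target // current)
        reached.append((len(image), len(image[0])))
    return image, reached


if __name__ == "__main__":
    # A synthetic 64 x 64 greyscale gradient plays the role of the base image.
    base = [[(x + y) % 256 for x in range(64)] for y in range(64)]
    final, reached = run_cascade(base)
    print(reached)
```

Each stage multiplies the side length by four, so two stages are enough to go from 64 to 1024 pixels per side. In the real system each stage adds detail instead of just repeating pixels.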
According to Google, Imagen delivers results with a finer level of detail than its predecessors and other similar text-to-image tools such as VQ-GAN+CLIP and DALL-E 2.
To support this claim, the company offers “a comprehensive and rigorous benchmark test for text-to-image models” called DrawBench, which compares the Google model with those mentioned above.
The benchmark tests how well the aspects described in the text carry over to the image, such as composition, fidelity, cardinality and the spatial relations between objects. The company also pointed to some key results of the research behind this AI, such as its new Efficient U-Net architecture, which is more efficient in both computation and memory.
Still under development
For now, Google is not making this artificial intelligence open source or publicly accessible. The decision is due to the potential risk of misuse by users. The company also acknowledged that, although data collected from the internet has enabled rapid algorithmic advances in its initial tests, many aspects still need improvement.
In this regard, Google stated that such data do not reflect diversity, but rather social stereotypes, oppressive viewpoints, and derogatory or harmful associations with marginalized identity groups. It also noted that, although it filtered the data collected for Imagen’s initial testing, the model also uses the LAION-400M dataset, which contains “inappropriate content including pornographic images, racial slurs, and harmful social stereotypes.”
It should be noted that a few weeks ago, the non-profit AI research company OpenAI introduced DALL-E 2, its new AI system that can transform words into realistic images. This technology can also edit photos on written request, a function that includes the ability to remove elements such as shadows, reflections and textures.
Source: Informacion
Barbara Dickson is a seasoned writer for “Social Bites”. She keeps readers informed on the latest news and trends, providing in-depth coverage and analysis on a variety of topics.