At its international Search On event, Google showcased how advances in artificial intelligence are reshaping its information products. These innovations create multidimensional search experiences that align more closely with how people actually think about and interact with information. Three innovations presented at the event show how search can find exactly what users are looking for by blending images, sound, and text in a natural, human way.
A more intuitive image search. Multisearch adds multimodal capabilities, letting a single query combine images with text for richer results.
Translating the world around us. Building on advances in AI, Google is moving beyond translating plain text to translating the text within images. Image translation is already used more than a billion times each month, rendering image text in more than a hundred languages.
An immersive view for exploring the world. With advances in computer vision and predictive models, maps are evolving from two-dimensional layouts into multidimensional views of the real world, letting users experience a place as if they were actually there.
For Google, this offers a glimpse of a future in which people can find exactly what they are looking for by combining images, sound, and text, just as humans naturally do.
image search
The camera is becoming a tool, a keyboard of the future, for accessing information and better understanding one's surroundings. Google Lens, launched in 2017, lets users search whatever they see through the camera or in an image. Today, Lens answers eight billion questions each month.
Image search is becoming more natural with multisearch, a new way to search using images and text at the same time. A beta version of multisearch rolled out in the United States a few months ago, and during Search On Google announced its expansion to more than seventy languages in the coming months. A practical extension will let users snap a photo of something unfamiliar, such as a dish or a plant, and find it nearby, at a restaurant or a garden centre, for example. This feature will be available in English in the United States this fall.
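Google has not published how multisearch is implemented; conceptually, though, it rests on joint image-text embeddings. The minimal sketch below, using the open-source CLIP model as a stand-in (an assumption, not Google's actual stack), embeds a photo and a text refinement into the same vector space, fuses them into one query, and ranks candidate results by similarity.

```python
# Hypothetical sketch of a multimodal (image + text) query in the spirit of
# multisearch. CLIP is an illustrative stand-in for Google's unpublished stack.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_image(path: str) -> torch.Tensor:
    inputs = processor(images=Image.open(path), return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

def embed_text(text: str) -> torch.Tensor:
    inputs = processor(text=[text], return_tensors="pt", padding=True)
    with torch.no_grad():
        feats = model.get_text_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

def multimodal_query(image_path: str, refinement: str) -> torch.Tensor:
    # Fuse the two modalities into one query vector; simple averaging is an
    # illustrative choice, not necessarily how Google combines them.
    query = embed_image(image_path) + embed_text(refinement)
    return query / query.norm(dim=-1, keepdim=True)

# Rank candidate catalogue items (here: text descriptions) against the query.
candidates = ["green floral dress", "green floral shirt", "blue plain shirt"]
query = multimodal_query("shirt_photo.jpg", "same pattern, but a dress")
scores = torch.cat([embed_text(c) for c in candidates]) @ query.T
print(sorted(zip(candidates, scores.squeeze(1).tolist()), key=lambda x: -x[1]))
```

Averaging the two embeddings is only one way to fuse modalities; production systems typically learn the fusion rather than hard-coding it.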
translating the world
One of the strongest benefits of visual understanding is its power to break down language barriers. Thanks to artificial intelligence, Google now translates the text inside images, not just plain text, with over a billion image translations a month in more than a hundred languages. Meaning often comes from a combination of words and the imagery around them, which helps convey intent, so translated text is now blended into its contextual image using machine learning based on generative adversarial networks. For example, pointing the camera at a magazine in another language will overlay the translated text onto the relevant images on the screen.
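The announcement describes the feature only at a high level. A rough sketch of the underlying pipeline, detect the text, translate it, and paint the translation back over the image, might look like the following; Tesseract OCR and the translate() helper are illustrative assumptions, and Google's version re-synthesises the background with GANs rather than drawing opaque boxes.

```python
# Rough sketch of an image-translation pipeline: OCR the text, translate it,
# and draw the translation back onto the image. Tesseract and the translate()
# stub are stand-ins, not Google's actual tools.
import pytesseract
from PIL import Image, ImageDraw

def translate(text: str) -> str:
    # Placeholder: plug in any translation API here (hypothetical helper).
    return text  # identity, so the sketch stays runnable

def translate_image(path: str, out_path: str) -> None:
    image = Image.open(path).convert("RGB")
    draw = ImageDraw.Draw(image)
    # image_to_data returns one row per detected word, with bounding boxes.
    data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
    for i, word in enumerate(data["text"]):
        if not word.strip():
            continue
        x, y, w, h = (data[k][i] for k in ("left", "top", "width", "height"))
        # Cover the original word, then draw its translation in place.
        draw.rectangle([x, y, x + w, y + h], fill="white")
        draw.text((x, y), translate(word), fill="black")
    image.save(out_path)

translate_image("magazine_page.jpg", "magazine_page_translated.jpg")
```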
immersive view
Advances in computer vision and predictive models are redefining maps. Classic, flat representations will evolve into immersive, multidimensional views that let users experience a place in a personalized way.
Just as real-time traffic updates made navigation more useful, the immersive view in Google Maps adds more context, such as weather conditions or crowd levels at a location. This helps users gauge what a place is like before visiting and decide where and when to go.
By combining these enhanced representations of the world with predictive models, users can get a sense of what a place will be like tomorrow, next week, or a month from now. The initial version of this capability already includes aerial views of hundreds of landmarks, and immersive view will expand to five major cities in the coming months.
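Google has not detailed the forecasting models behind these predictions. Conceptually, forecasting how busy a place will be comes down to learning from historical visit patterns; the toy sketch below, with hypothetical data and a hypothetical predict_busyness() helper, simply averages past counts for the same weekday and hour to convey the idea.

```python
# Toy forecast of how busy a place will be at a future time, by averaging
# historical visit counts for the same weekday and hour. Purely illustrative;
# Google's predictive models are not public.
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical history: (timestamp, visitor_count) samples for one location.
history = [
    (datetime(2022, 9, 2, 18), 120),  # Friday 18:00
    (datetime(2022, 9, 9, 18), 140),  # Friday 18:00
    (datetime(2022, 9, 5, 9), 30),    # Monday 09:00
]

def predict_busyness(when: datetime) -> float:
    """Average the past counts observed at the same weekday and hour."""
    buckets = defaultdict(list)
    for ts, count in history:
        buckets[(ts.weekday(), ts.hour)].append(count)
    samples = buckets.get((when.weekday(), when.hour), [])
    return mean(samples) if samples else float("nan")

# How busy might this spot be next Friday at 18:00?
print(predict_busyness(datetime(2022, 9, 16, 18)))  # -> 130.0
```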