Tuesday, June 30, 2026

Google’s Gemini Omni turns images, audio, and text into video — and that’s just the start

CitrixNews Staff

When Google launched Gemini three years ago, the goal was to build a multimodal large language model: a single neural network trained on text, images, audio, and video that could generate content in any of those formats.

Today, at its Google I/O developer conference, the company took a concrete step toward that goal with Gemini Omni, a new family of multimodal models that Google CEO Sundar Pichai says will be able to “create anything from any input.” 

Originally reported by TechCrunch. Read the full story at the original source.