Gemini Robotics is a new model that enables robots to connect visible items with potential actions, allowing them to perform tasks as instructed. The model is trained in such a way that it can generalize behavior across different hardware. This advanced model has been showcased in demonstration videos where robots perform a variety of tasks, from folding paper to handling vegetables, all in response to spoken commands.
A version of the Gemini model, Gemini Robotics-ER, focuses on visual and spatial understanding. This model is designed for use by other robotics researchers to train their own models for controlling robot actions. In a video demonstration, Google DeepMind researchers used Gemini Robotics-ER to control a humanoid robot named Apollo, demonstrating the model's potential in real-world scenarios.
The development of Gemini Robotics and Gemini Robotics-ER represents a significant milestone in the field of AI and robotics. Google DeepMind's new models have successfully controlled robots in hundreds of specific scenarios that were not part of their training data. This general-concept understanding makes the models far more versatile and useful. The breakthroughs behind powerful chatbots like OpenAI's ChatGPT and Google's Gemini have raised hopes for a similar revolution in robotics.
"We've been able to bring the world-understanding—the general-concept understanding—of Gemini 2.0 to robotics. The new model is able to control different robots successfully in hundreds of specific scenarios not previously included in their training. Once the robot model has general-concept understanding, it becomes much more general and useful," said Kanishka Rao, a robotics researcher at Google DeepMind.