Tech & Innovation - March 12, 2025

Google DeepMind's Gemini: The Next Step in AI-Powered Rob...

Image related to the article
In the realm of science fiction, artificial intelligence (AI) often powers an array of intelligent and sometimes dangerous robots. However, today's most advanced AI technologies are largely confined to chat windows. Google DeepMind is planning to change that with the introduction of Gemini, a new AI model designed to integrate language, vision, and physical action into robotics. The company has released several demonstration videos showing robots equipped with Gemini Robotics performing tasks in response to spoken commands. This represents a significant leap forward in the field of robotics, opening up a world of possibilities for the future.

Read more at source.

Gemini Robotics: A New Frontier

Gemini Robotics is a new model that enables robots to connect visible items with potential actions, allowing them to perform tasks as instructed. The model is trained in such a way that it can generalize behavior across different hardware. This advanced model has been showcased in demonstration videos where robots perform a variety of tasks, from folding paper to handling vegetables, all in response to spoken commands.

Gemini Robotics-ER: Embodied Reasoning

A version of the Gemini model, Gemini Robotics-ER, focuses on visual and spatial understanding. This model is designed for use by other robotics researchers to train their own models for controlling robot actions. In a video demonstration, Google DeepMind researchers used Gemini Robotics-ER to control a humanoid robot named Apollo, demonstrating the model's potential in real-world scenarios.

The Future of Robotics with Gemini

The development of Gemini and Gemini Robotics-ER represents a significant milestone in the field of AI and robotics. Google DeepMind's new models have demonstrated the ability to control robots successfully in hundreds of specific scenarios not previously included in their training. This general-concept understanding makes the model much more versatile and useful. The breakthroughs that led to the development of powerful chatbots like OpenAI's ChatGPT and Google's Gemini have raised hopes for a similar revolution in robotics.

We've been able to bring the world-understanding—the general-concept understanding—of Gemini 2.0 to robotics. The new model is able to control different robots successfully in hundreds of specific scenarios not previously included in their training. Once the robot model has general-concept understanding, it becomes much more general and useful. - Kanishka Rao, Robotics Researcher at Google DeepMind