Google DeepMind Unveils On-Device Gemini Robotics Model for Local Task Execution

Google DeepMind has introduced Gemini Robotics On-Device, a vision-language-action (VLA) model that can run tasks locally on robots without an internet connection. The model builds on the company’s previous Gemini Robotics model, released in March, and is designed to control a robot’s movements. It is small and efficient enough to run directly on a robot, and developers can control and fine-tune it using natural language prompts.
According to Google DeepMind, the new model performs at a level close to the cloud-based Gemini Robotics model in benchmarks and outperforms other on-device models in general benchmarks. Carolina Parada, Head of Robotics at Google DeepMind, noted that while the hybrid model is still more powerful, the on-device model is surprisingly strong and can serve as a starter model or for applications with poor connectivity.

In a demonstration, robots running the local model were shown unzipping bags and folding clothes. The model, initially trained for ALOHA robots, was adapted to work on a bi-arm Franka FR3 robot and Apptronik’s Apollo humanoid robot. The bi-arm Franka FR3 successfully handled scenarios and objects it had not seen before, such as performing assembly on an industrial belt. Developers can train robots on new tasks in the MuJoCo physics simulator by showing the model 50 to 100 demonstrations of a task.
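DeepMind has not published the format it uses for these demonstrations, so the sketch below is only a rough illustration of what 50 to 100 demonstration episodes might look like as data: each pairs a natural-language instruction with a teleoperated trajectory of observation/action steps. All field names here are assumptions, not the actual schema.

```python
from dataclasses import dataclass, field

# Illustrative only: hypothetical structure for VLA demonstration data,
# not Google DeepMind's actual format.

@dataclass
class Step:
    observation: list[float]  # e.g. joint positions and camera features
    action: list[float]       # e.g. commanded joint targets

@dataclass
class Demonstration:
    instruction: str  # natural-language task prompt
    steps: list[Step] = field(default_factory=list)

# DeepMind suggests 50 to 100 demonstrations per new task.
demos = [
    Demonstration(
        instruction="fold the shirt on the table",
        steps=[Step(observation=[0.0, 0.12, -0.3], action=[0.05, 0.1, -0.25])],
    )
    for _ in range(50)
]
```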
Google DeepMind also released a software development kit, the Gemini Robotics SDK. The SDK provides the full lifecycle tooling needed to use Gemini Robotics models: accessing checkpoints, serving a model, evaluating it on the robot and in the MuJoCo simulator, uploading data, and fine-tuning it.
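The Gemini Robotics SDK itself is gated behind a trusted-tester program, but MuJoCo’s open-source Python bindings (`pip install mujoco`) show what the simulator half of such an evaluation loop looks like at its simplest. In this sketch, the `policy` function is a hypothetical stand-in for a served model checkpoint, implemented as a trivial PD controller rather than a real VLA:

```python
import numpy as np
import mujoco

# A one-joint arm driven by a single motor; stands in for a real robot model.
XML = """
<mujoco>
  <worldbody>
    <body name="arm" pos="0 0 0.5">
      <joint name="shoulder" type="hinge" axis="0 1 0"/>
      <geom type="capsule" fromto="0 0 0  0.3 0 0" size="0.03"/>
    </body>
  </worldbody>
  <actuator>
    <motor joint="shoulder" ctrlrange="-1 1"/>
  </actuator>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)

def policy(qpos: np.ndarray, qvel: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a fine-tuned VLA checkpoint.

    A real evaluation would query the served model with camera images and
    a language instruction; here a PD controller drives the joint to zero.
    """
    return -2.0 * qpos - 0.5 * qvel

for _ in range(500):
    data.ctrl[:] = np.clip(policy(data.qpos, data.qvel), -1.0, 1.0)
    mujoco.mj_step(model, data)  # advance physics by one timestep

print("final joint angle:", float(data.qpos[0]))
```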
The on-device Gemini Robotics model and its SDK will be available to a group of trusted testers while Google continues to work toward minimizing safety risks.

Other tech companies are also showing interest in robotics. Nvidia is building a platform to create foundation models for humanoids, with its CEO, Jensen Huang, noting that building foundation models for general humanoid robots is one of the most exciting problems to solve in AI today. Nvidia has been championing robotic innovation through initiatives like Isaac and Jetson, and last year joined the humanoid race with Project GR00T, a general-purpose foundation model for humanoid robots.
Hugging Face is developing open models and datasets for robotics and has released an open AI model for robotics called SmolVLA. The model is trained on community-shared datasets and outperforms much larger robotics models in both virtual and real-world environments. Hugging Face aims to democratize access to vision-language-action models and accelerate research toward generalist robotic agents. The firm also launched LeRobot, a collection of robotics-focused models, datasets, and tools, and recently acquired the robotics startup Pollen Robotics, unveiling several inexpensive robotics systems for purchase.
