
Google Debuts On‑Device Gemini Robotics Model

Google DeepMind’s new Gemini Robotics On-Device model runs vision-language-action tasks on robots offline, performs close to the hybrid version according to Google, and comes with an SDK.

Google DeepMind released a new language model, Gemini Robotics On-Device, on Tuesday. The model runs tasks on robots locally, without an internet connection.

Gemini Robotics On-Device follows the Gemini Robotics model the company released in March and can control a robot’s movements. Developers can control and fine-tune the model for different needs using natural language prompts.

Google says the model performs at a level close to the cloud-based Gemini Robotics model in benchmarks. The company also says it outperforms other on-device models in general benchmarks, although it did not name those models.

Source: Google

In a demonstration, the company showed robots running the local model and performing tasks such as unzipping bags and folding clothes.

Google says the model was initially trained for ALOHA robots but was later adapted to run on the Franka FR3 bi-arm robot and Apptronik’s Apollo humanoid robot.

ALOHA robotics | Source: Interesting Engineering

Google says the bi-arm Franka FR3 successfully handled scenarios and objects it had not encountered before, such as assembling an industrial belt.

Google DeepMind is also releasing a Gemini Robotics SDK. According to the company, developers can train robots on new tasks by showing them 50 to 100 demonstrations, using these models in the MuJoCo physics simulator.

Other AI model developers are also exploring robotics. Hugging Face is not only developing open models and datasets for robotics, but it is also working on robots.

Nvidia is building a platform for creating foundation models for humanoids. RLWRLD, a Korean startup backed by Mirae Asset, is developing foundation models for robots.
