
Vangrid

@vangrid_io
Building the decentralized perception grid for sovereign defense and autonomous physical agents.


JUST IN: @GoogleDeepMind launches Gemini Robotics-ER 1.6! 🧠

Google DeepMind introduced Gemini Robotics-ER 1.6, a reasoning-first model that lets robots understand their environments through spatial reasoning and multi-view understanding. The model specializes in visual and spatial understanding, task planning, and success detection. It acts as the high-level reasoning model for a robot and can call tools such as Google Search, vision-language-action models, or any third-party user-defined functions.

New capabilities include instrument reading, which lets robots read complex gauges and sight glasses and was discovered through collaboration with Boston Dynamics. The model also covers precision object detection and counting, relational logic, motion reasoning, and constraint compliance.

The model uses points as intermediate steps to reason about complex tasks. It lets agents choose intelligently between retrying a failed attempt and progressing to the next stage. It also advances multi-view reasoning: it understands multiple camera streams and the relationships between them, even in dynamic or occluded environments.

The safety improvements are especially important: superior compliance with safety policies, better adherence to physical safety constraints (safer decisions about which objects can be manipulated), and improved hazard identification. 🚧

So this is high-level planning that calls lower-level execution models, in contrast to the end-to-end visuomotor control approach of models like π0 and GEN-1. It's getting interesting! 🔥

More details here: deepmind.google/blog/gemini-ro…

♻️ Join the weekly robotics newsletter, and never miss any news → ziegler.substack.com












