Pannag Sanketi

759 posts

@pannag_

Tech Lead Manager / Researcher @GoogleAI @GoogleDeepMind Robotics. Open X-Embodiment Co-Lead. Table Tennis Robots Lead. @UCBerkeley @iitmadras alum.

Joined September 2009
1K Following · 894 Followers
Pinned Tweet
Pannag Sanketi @pannag_
I will be at #NeurIPS2025 this week! I will be giving a talk on the Table Tennis Robotics project at Google DeepMind at the MyoSymposium on Athletic Intelligence (sites.google.com/corp/view/myos…) and discussing our two papers at NeurIPS:
Andy Zeng @andyzengineer
Back in grad school, live ML robot demos used to be a tall order: bump a camera and it's over. It was SOTA to duct-tape "DO NOT TOUCH" signs everywhere 😭 and pray the moon was in the right phase… Now? Turns out, with enough pretraining data, physical AI foundation models like GEN-0 don't care: they generalize to new environments, lighting, robots… it just works. Anywhere. Millions of tasks, thousands of settings, a lifetime of physical experience, all baked into model weights. It feels like we're entering a new era for intelligent robots, and I am SO excited for it.
Generalist @GeneralistAI

We ran a live demo at @nvidia GTC last week, but the real story is how quickly we got it running. The system was up and running in days, not weeks. This is a step toward robots that can be deployed quickly without task-by-task programming. How we made it happen👇 🧵 (1/6)

Guri Singh @heygurisingh
🚨 BREAKING: Someone just open-sourced a full pipeline that trains a humanoid robot to play table tennis by watching humans play. It's called Robomotion and it's beyond insane. Here's everything you need to know:
→ Takes raw motion-capture footage of real table tennis players
→ Filters out noisy data: only keeps smooth, physically plausible trajectories
→ Trains a Unitree G1 humanoid using PPO reinforcement learning
→ Runs millions of parallel episodes on GPU with MuJoCo MJX
→ Randomizes friction, mass, and center of mass so the policy transfers to real hardware
→ Exports to ONNX format and deploys directly on the physical robot
Most robotics teams spend months hand-tuning controllers for a single motion. This system watches a human, learns the skill, and executes it with the robot's own body dynamics. No teleoperation. No manual programming. JAX + Brax + MuJoCo. 100% open source. (Link in the comments)
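The pipeline above is concrete enough to sketch. Below is a minimal, illustrative Python/NumPy version of two of the listed steps: filtering mocap clips for physical plausibility and randomizing dynamics for sim-to-real transfer. Every function name and threshold here is hypothetical; the actual Robomotion stack, per the tweet, runs on JAX + Brax + MuJoCo MJX.

```python
import numpy as np

rng = np.random.default_rng(0)

def is_physically_plausible(traj, dt, max_jerk=2000.0):
    """Keep only smooth mocap clips, as the thread describes.

    traj: (T, D) array of joint positions. Rejects clips whose
    finite-difference jerk (third derivative) exceeds a threshold
    anywhere; the threshold here is a made-up stand-in.
    """
    vel = np.diff(traj, axis=0) / dt
    acc = np.diff(vel, axis=0) / dt
    jerk = np.diff(acc, axis=0) / dt
    return np.abs(jerk).max() <= max_jerk

def randomize_dynamics(nominal):
    """Sample per-episode physics parameters around nominal values.

    Randomizing friction, link masses, and center-of-mass offsets
    trains the policy on a family of simulated robots so it is more
    likely to transfer to the one real robot.
    """
    return {
        "friction": nominal["friction"] * rng.uniform(0.7, 1.3),
        "mass": nominal["mass"] * rng.uniform(0.85, 1.15, size=nominal["mass"].shape),
        "com_offset_m": rng.uniform(-0.01, 0.01, size=3),  # +/- 1 cm shift
    }

# Toy usage: filter fake mocap clips, then sample one episode's dynamics.
nominal = {"friction": 1.0, "mass": np.ones(23)}  # 23 links, invented
clips = [np.cumsum(rng.normal(0, 0.001, size=(200, 29)), axis=0) for _ in range(8)]
kept = [c for c in clips if is_physically_plausible(c, dt=0.02)]
print(f"kept {len(kept)}/{len(clips)} clips")
print(randomize_dynamics(nominal)["friction"])
```

From there, per the tweet's description, the filtered clips serve as reference motions for PPO training, and the resulting policy network is exported to ONNX for on-robot inference.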
Pannag Sanketi reposted
Mohana Krishna (మోహన కృష్ణ)
Adding to this with a personal experience — I've witnessed this firsthand at Virupaksha Temple, Hampi. That glowing inverted image behind us? It's the 52-metre gopuram — projected through a tiny opening in the wall. A pinhole camera. Built centuries ago. Our ancestors weren't worshipping in spite of science. They were worshipping through it. Science. Geometry. Astronomy. Devotion. One seamless pursuit. 🇮🇳 @AnandMahindra — Hampi deserves its own visit on your list too. 🙏
[image]
Yann LeCun @ylecun
I think you missed the main ideas.
- The basic premise of JEPA is that training by reconstruction/prediction in input space is evil (or counterproductive). The details are almost always unpredictable. Hence prediction must take place in representation space, where unpredictable details are eliminated.
- The main issue with JEPA is how to prevent collapse (in the absence of a reconstruction loss). There are two classes of methods:
(1) EMA: using weights in the target encoder that are an exponential moving average (EMA) of the weights in the other encoder (I-JEPA, V-JEPA, DINO, BYOL).
(2) Infomax: using a regularizer that attempts to maximize the information content of the representation (e.g. over a batch). There are two sets of methods for that:
(2a) sample-contrastive methods, which want to make each representation vector different from the others (Siamese nets, DrLIM, SimCLR, etc). They tend not to work well in high dimension, and to require large batches and hard negative mining.
(2b) dimension-contrastive methods, which want to make each variable independent from the others (Barlow Twins, VICReg, SIGReg/LeJEPA, MMCR, MCR2...).
Bottom line:
A. SSL by reconstruction/prediction doesn't work for high-dimensional, continuous, noisy data.
B. EMA sucks: no loss function being minimized, requirement for weight sharing...
C. Sample-contrastive infomax doesn't scale to high dimension.
D. My money is on dimension-contrastive methods like SIGReg/LeJEPA.
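LeCun's option (2b) is easy to make concrete. Here is a minimal NumPy sketch of a VICReg-style dimension-contrastive regularizer: a per-dimension variance hinge that prevents collapse to a constant, plus an off-diagonal covariance penalty that decorrelates dimensions. Coefficients, the hinge target, and normalization details vary across Barlow Twins/VICReg/LeJEPA; treat this as an illustration, not any paper's reference implementation.

```python
import numpy as np

def dimension_contrastive_reg(z, gamma=1.0, eps=1e-4):
    """VICReg-style regularizer over a batch of embeddings z: (N, D).

    Both terms act per-dimension rather than per-sample:
      - variance: hinge pushing each dimension's std above gamma,
        so the encoder cannot collapse to a constant vector;
      - covariance: penalizes off-diagonal covariance so dimensions
        carry non-redundant information (the infomax idea).
    """
    z = z - z.mean(axis=0)
    std = np.sqrt(z.var(axis=0) + eps)
    var_loss = np.maximum(0.0, gamma - std).mean()
    n, d = z.shape
    cov = (z.T @ z) / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_loss = (off_diag ** 2).sum() / d
    return var_loss, cov_loss

rng = np.random.default_rng(0)
collapsed = np.full((256, 32), 0.1)   # every sample identical: collapse
healthy = rng.normal(size=(256, 32))  # spread-out embeddings
print(dimension_contrastive_reg(collapsed))  # large variance penalty
print(dimension_contrastive_reg(healthy))    # both terms near zero
```

Note what is absent: no negative pairs across samples and no batch-size-dependent contrast, which is why this family scales to high dimension better than sample-contrastive losses, per the argument above.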
rohan anil @_arohan_
On a long flight, I finally decided to dive into what JEPA is all about. You can convert an encoder-decoder into JEPA as follows:
- replace the target encoder with a moving average of the encoder to avoid collapse
- use a projection to get a summary embedding, instead of a token embedding, for both input and target
- use all the clever losses to avoid scale sensitivity
If you want tokens out, slap a decoder on top of the summary representation. Feels like all of this could be an ablation.
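The first bullet, replacing the target encoder with a moving average, is only a few lines of code. A sketch assuming parameters stored as a dict of arrays; the decay value 0.996 is a typical BYOL-style choice, not something stated in the tweet.

```python
import numpy as np

def ema_update(target_params, online_params, tau=0.996):
    """target <- tau * target + (1 - tau) * online.

    The target encoder receives no gradients; it trails the online
    encoder, which is the collapse-avoidance trick the tweet refers
    to (as in BYOL, I-JEPA, V-JEPA).
    """
    return {k: tau * target_params[k] + (1.0 - tau) * online_params[k]
            for k in target_params}

online = {"w": np.zeros((4, 4))}
target = {"w": np.ones((4, 4))}
for step in range(3):
    online["w"] = online["w"] + 0.5   # stand-in for a gradient step
    target = ema_update(target, online)
print(target["w"][0, 0])              # slowly tracks the online weights
```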
Pannag Sanketi @pannag_
@JitendraMalikCV Congrats Jitendra on the great journey at Meta! Very much looking forward to the work coming out of FAR!
Jitendra MALIK @JitendraMalikCV
1/4 For the last several years I worked part-time at the FAIR lab at Meta, in addition to being a professor at UC Berkeley. That phase is now over, and starting Jan. 5, I will be leading a robotics research effort at Amazon FAR in San Francisco, while continuing at Berkeley.
Pannag Sanketi @pannag_
@matthewsyed They will make all the talks available online soon. I will keep you posted. Thanks!
Matthew Syed @matthewsyed
@pannag_ Is this talk available to watch online? If you could follow, I’d like to DM. Your research looks v exciting!
Pannag Sanketi reposted
GDP @bookwormengr
Congrats Pieter Abbeel @pabbeel, Amazon's new AGI Head. TPOT needs no introduction to him. He is a pioneer in robotics and deep RL, with 231K citations. He is a professor at UC Berkeley and has been an Amazon Distinguished Scientist / VP / Scholar since 2024, focused on advancing AI and robotics. He has advised legendary doctoral students like Chelsea Finn (@chelseabfinn) of Physical Intelligence and John Schulman (@johnschulman2), ex-OpenAI, ex-Anthropic, and now co-founder of Thinking Machines. Looking forward.
[image]
Pannag Sanketi reposted
Demis Hassabis @demishassabis
Gemini has always had exceptionally strong multimodal capabilities. Gemini 3 Pro is an incredible vision AI model and is SOTA across all main vision & multimodal benchmarks. It’s great for document, screen, image, video & spatial understanding tasks - try now in the @GeminiApp!
[image]
Pannag Sanketi @pannag_
Looking forward to catching up with old friends and making new ones! Message me on X or LinkedIn if you would like to chat / grab a coffee together.
Tony Zhao @tonyzzhao
Today, we present a step-change in robotic AI @sundayrobotics. Introducing ACT-1: A frontier robot foundation model trained on zero robot data. - Ultra long-horizon tasks - Zero-shot generalization - Advanced dexterity 🧵->
Ted Xiao @xiao_ted
After 8 unforgettable years, I have decided to leave Google DeepMind. I feel immensely grateful to have had the opportunity to help transform the dream of general-purpose robot learning from a heretical fringe idea into a normalized technology roadmap. It has been the honor of a lifetime to work on the most challenging and important problems of our time with the brightest, kindest, and most talented colleagues I could have wished for.

Thank you to Julian and Vincent for taking a chance on me back in 2017, when a ragtag team at Google Brain began exploring the potential for end-to-end learning on arm farms in the real world. The team has always dreamed big: my “starter project” with Corey and Pierre was to work on a goal-conditioned imitation policy capable of going from any initial condition (latent embedding) to any goal state. That 3-month project turned into a 2-year endeavor! But even though research ambitions were lofty, colleagues and mentors have always been grounded and compassionate by default. Alex H, Karol, Julian, and Sergey supported my vision of concurrent control RL at scale while allowing me the space to grow into a creative researcher on my own terms.

The team’s technical progress and my own research taste began to accelerate substantially in 2020, when Kanishka and Karol inspired the whole team to bet big on one single crazy moonshot: a general robot policy that could accomplish thousands of household manipulation tasks. Such an unprecedented group effort was new to the whole team but extremely satisfying—to learn how to harmoniously navigate 0-to-1 real-world systems scaling (robot fleets, teleoperators, scaled learning stacks) alongside rigorous scientific exploration (an objective comparison of the scaling properties of imitation and reinforcement learning). I learned so much from all my comrades-in-arms during this time, and even to this day, many of my research and engineering intuitions draw from the lessons I learned from Eric, Yao, Alex I, Keerthana, and Yevgen.

The following period, starting in 2022, was absolutely magical and unique in the breadth and depth of imaginative explorations that I was privileged to contribute to and lead. Exploring the potential of foundation models for robotics changed my research outlook permanently, and projects like SayCan, RT-1, and RT-2 felt like the first magically viral moments when the world started thinking more seriously about what the promise of general and performant embodied AI might look like. When the first generalist VLAs began to reliably perform tasks that we hadn’t collected data for, it was a huge lightbulb moment for our team and the field. During this time, I was immensely inspired by what high agency, manic creativity, and blazing iteration speed can do for research, learning from extremely kind and productive colleagues like Fei, Brian, Andy, Pete, Quan, Harris, and Danny.

I applied this approach of wildly creative research to areas I cared about, such as creating better action representations, understanding robot generalization, and leveraging VLMs for data quality and augmentation. I am grateful to teammates who joined me on these adventurous explorations, such as Chelsea, Dorsa, Jonathan, Wenhao, Tianli, Montse, Sean, Austin, Kelly, and Paul. I also deeply appreciate all the academic collaborations during this time—ranging from multi-institution cross-embodiment learning to open-source VLAs to scalable offline evaluation to organizing workshops. Thank you, students, interns, and friends; in particular, Soroush, Jiayuan, Laura, Xuanlin, Kyle, Karl, Oier, Dhruv, Annie, Jensen, Priya, Suneel, Ike, Homanga, Hao, and Xuesu.

In the final chapter of my career at GDM, starting in 2024, I became enamored with the science and impact of frontier models and how to harness them properly in robotics. It always fundamentally bugged me that robot learning often looked like “classical” machine learning of just fitting simple distributions with small models, rather than the polished scaled systems and science of how frontier models are developed with pre-training, mid-training, and post-training. I wanted to learn about that world and figure out how to make AGI understand the physical world. I am proud of the progress we have made, and from where we started with Gemini 1.0 to today, the research innovations we have unlocked have placed both Gemini and Gemini Robotics clearly at the forefront of both fundamental world understanding and general VLA control.

Thank you so much to my teammates in Embodied Reasoning who make every day bright, interesting, and fun: Fei, Jacky, Laura, Wentao, Annie, Lewis, Ksenia, Mohit, Sean, and Danny. Thank you to friends in Gemini Multimodal who taught me how to frontier model: Xi, Karel, Ishita, and Xudong. Thank you to the VLA whisperers who have shown me how very far innovation and perseverance can take you: Coline, Giulia, Claudio, Alex L, Sumeet, Ashwin, Sudeep, Debi, and Ayzaan. Thank you to mentors throughout the years who have provided shining examples that velocity and impact, and compassion, are not zero-sum: Carolina, Jie, Kanishka, Nicolas, Jonathan, Pierre, Vincent, Karol, Sergey, Chelsea, and Julian. Thank you, thank you, thank you.

It has been such an unbelievable adventure, and I am so fortunate to have been part of the crazy team that started the technology breakthroughs transforming the world into one where general and helpful embodied AGI is ubiquitous in society. I will always be #1 GDM fan! As for my own journey, I will be embarking on a new adventure, both familiar and very different, and hope to have more to share soon.
[images]
Pannag Sanketi @pannag_
@danijarh @GoogleDeepMind Thanks for your awesome work at GDM, Danijar! Congrats on your fantastic achievements and journey here. Good luck on your next adventure!
Danijar Hafner @danijarh
Today is my last day at @GoogleDeepMind. After almost exactly 10 years at Google, including 12 internships and the last 2 1/2 years full time, it really feels like a chapter coming to an end. I'm grateful for all the experiences and friends I've made at Google and DeepMind.

I still remember my first Brain internship in Mountain View in 2016 with James Davidson and @V_Vanhoucke, at a time when nobody had a working PPO implementation and we were wrangling with TensorFlow graphs 😄 The moment @lukaszkaiser showed us the first plausible Wikipedia page generated by a "big" LSTM. @ashVaswani, full of excitement, explaining the compute efficiency of a new architecture that later became the Transformer and asking me to try it for RL (I did not :P)

The excitement of working on Deep RL and generative models at DeepMind during my master's in London, which turned into PlaNet with @countzerozzz and @itfische. Figuring out Karl Friston's free energy principle with Nicolas Heess and @AdaptiveAgents (which took a few more years to get right). Spending a good part of my PhD at the Brain Team in Toronto working on multiple generations of Dreamer with @mo_norouzi, various collaborations, and celebrating the Turing Award with @geoffreyhinton. And over the last few years, working from Berkeley/SF on world models with @wilson1yan, with significant resources thanks to @countzerozzz and @koraykv, and seeing video models & world models accomplish results that seemed completely out of reach just a few years ago.

With mixed feelings but also excitement, it's time to start a new chapter!
[image]
Pannag Sanketi reposted
Harsh Goenka @hvgoenka
Forget Shark Tank, forget Ideabaaz, this pitch stole my heart….
Kevin Zakka @kevin_zakka
Super happy and honored to be a 2025 Google PhD Fellow! Thank you @Googleorg for believing in my research. I'm looking forward to making humanoid robots more capable and trustworthy partners 🤗
Google.org @Googleorg

🎉 We're excited to announce the 2025 Google PhD Fellows! @GoogleOrg is providing over $10 million to support 255 PhD students across 35 countries, fostering the next generation of research talent to strengthen the global scientific landscape. Read more: goo.gle/43wJWw8

Pannag Sanketi reposted
RoboHub🤖 @XRoboHub
At the launch event, AgiBot demonstrated the ultra-low-latency teleoperation capability of the G2 robot. An operator in Beijing, more than 2,000 kilometers (1,243 miles) away, remotely shot at a balloon floating in the Shanghai studio. The first shot missed because of the balloon's constant swaying, but the second was successful, showcasing the robot's high precision and low latency.
RoboHub🤖 @XRoboHub

AgiBot has formally unveiled its G2 humanoid robot, a system designed to transition into various industries and liberate humans from repetitive labor. G2 features high-performance joints, precision torque sensors, and an advanced spatial perception system, supporting quick deployment and multi-modal voice interaction.
► Factory Floor Performance: The G2 is engineered to industrial standards. In a safety-belt lock production line, robots collaborate with human workers, performing tasks like pressing lock cores. The G2 collects production data to continuously train and iterate models (local server deployment ensures data privacy), steadily improving its operational ability.
► Mobility & Safety: The G2 navigates narrow factory aisles using dual LiDAR and full-panorama vision for environment sensing and collision detection. Its chassis is designed to overcome common obstacles (speed bumps, elevator gaps). It supports 24/7 continuous operation via autonomous return-to-charge and battery swapping.
► Humanoid Design Advantage: The G2's design includes a three-degree-of-freedom flexible waist, allowing it to mimic natural human movements like bending and side-leaning. This dramatically expands its operational workspace and enables seamless integration into existing human-centric production lines without costly modifications.
► Advanced Dexterity & Learning (Lab): The new G02 arm features high-precision joint torque sensors that allow it to precisely sense external forces and adjust stiffness, mimicking human hand compliance. Using real-machine reinforcement learning (RL), the G2 can learn complex, delicate tasks like memory-stick insertion in about one hour with minimal human intervention.
► Logistics & Grasping: In logistics sorting, the G2 uses a 19-degree-of-freedom mechanical dexterous hand (20 N maximum fingertip force; 35 kg capacity for hard objects) equipped with 3D tactile sensors to ensure it grasps securely without damaging items. Its full-body articulation (waist and legs) aids grasping and posture adjustment.
► Model & Data: G2's intelligence is powered by the Go-One Large Embodied Model (VLA architecture: Vision-Language-Latent Action) and the GE-One World Model (vision-centric predictive modeling), trained on the AgiBot World real-machine dataset (over 500k downloads).
► Service & Interaction: The G2 is deployed as a guide/receptionist in settings like art museums. It uses its high-DOF head, arms, and waist to point to exhibits, maintains eye contact while navigating difficult spaces (chassis walks forward, body faces backward), handles specialized and random queries, and uses proactive safety features (stops movement, issues warnings) when people get too close.

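A quick sanity check on the "ultra-low latency over 2,000 km" claim: signal propagation alone sets a floor on the control loop. A back-of-the-envelope calculation, assuming light in optical fiber travels at roughly two-thirds of c and an idealized straight-line route.

```python
# Minimum network round trip for Beijing -> Shanghai teleoperation.
# Assumes ~2/3 c propagation in fiber and a straight-line route; real
# routes, switching, video encoding, and control loops add on top.
distance_km = 2000
fiber_speed_km_s = 2.0e5  # ~ (2/3) * 300,000 km/s
one_way_ms = distance_km / fiber_speed_km_s * 1e3
print(f"one-way: {one_way_ms:.0f} ms, round trip: {2 * one_way_ms:.0f} ms")
# -> ~10 ms one way, ~20 ms round trip before any processing delay.
```

So roughly 20 ms of round-trip delay is unavoidable physics; whatever latency AgiBot achieved sits on top of that floor, which is still well within what a human operator can compensate for on a slowly drifting target like a balloon.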
Pannag Sanketi reposted
Keerthana Gopalakrishnan @keerthanpg
The most fun part about being a roboticist is that you get to play with robots and call it “work”. Here’s Apollo with GR 1.5 trying to grab from my unyielding hand, like a toddler: a test of manipulation generalization. Cannot believe this is the worst humanoids will ever be!
Pannag Sanketi reposted
University of California @UofCalifornia
University of California faculty and alumni won five Nobel Prizes this week, setting a new record for the most faculty of one institution to achieve this great honor in a single year 🏅💙💛 These remarkable achievements highlight the ongoing contributions of America’s #1 public research university and the central role of federal funding in advancing world-changing scientific inquiry. bit.ly/4qn6CZV