Nishanth Kumar

367 posts

@nishanthkumar23

AI/ML + Robots PhD Student @MIT_LISLab, intern @AIatMeta. Formerly @NVIDIAAI, @rai_inst, @brownbigai, @vicariousai and @uber.

Cambridge, MA · Joined July 2015
969 Following · 1.8K Followers
Pinned Tweet
Nishanth Kumar @nishanthkumar23
State-of-the-art robot policies often need hundreds of hours of data. What if we needed none? Introducing TiPToP: a manipulation system that zero-shots open-world tasks from pixels and language using vision foundation models and GPU-parallelized Task and Motion Planning (TAMP).
6 replies · 36 retweets · 201 likes · 62.5K views
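For readers curious how a "pixels + language, zero training data" recipe like this can hang together, here is a minimal hypothetical Python sketch of the perceive → task-plan → motion-plan decomposition that TAMP systems use. Every function name and all the toy logic below are illustrative assumptions, not TiPToP's actual architecture or API.

```python
# Hypothetical sketch of a "pixels + language -> zero-shot plan" pipeline.
# All names and toy logic are assumptions for illustration only.

def perceive(image):
    """Stand-in for a vision foundation model: open-vocabulary detection
    returning object names mapped to rough 2D positions (hard-coded here)."""
    return {"green block": (0.3, 0.1), "blue plate": (0.6, 0.2)}

def task_plan(instruction, objects):
    """Stand-in for symbolic task planning: map the instruction to a
    sequence of parameterized skills (here, naive keyword matching)."""
    if "put" in instruction.lower() and "on" in instruction.lower():
        target, surface = "green block", "blue plate"  # assume grounding worked
        return [("pick", target), ("place", target, surface)]
    raise ValueError("instruction not supported by this toy planner")

def motion_plan(skill, positions):
    """Stand-in for GPU-parallelized motion planning: a real system would
    sample and collision-check many trajectories per skill in parallel."""
    return f"trajectory for {skill}"

positions = perceive(image=None)  # no task-specific training data used
skills = task_plan("Put the green block on the blue plate", positions)
print([motion_plan(s, positions) for s in skills])
```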
Nishanth Kumar retweeted
Wenlong Huang @wenlong_huang
What representation enables open-world robot manipulation from generated videos? Introducing Dream2Flow, our recent work that bridges video generation and robot control with 3D object flow. dream2flow.github.io @Stanford #ICRA2026 1/N
12 replies · 50 retweets · 284 likes · 85.7K views
Kevin Zakka @kevin_zakka
Impending graduation is making me really sad, PhD has been so lovely 😭
8 replies · 0 retweets · 130 likes · 6.6K views
Eugene @VorobiovEugene
@nishanthkumar23 Awesome, would love to be able to try it out in sim :)
1 reply · 0 retweets · 1 like · 11 views
Nishanth Kumar @nishanthkumar23
@VorobiovEugene We’re getting some feedback from several beta testers and iterating rapidly to make it as easy to use as possible: hopefully we’ll be able to release it in the next week or so!
1 reply · 0 retweets · 1 like · 22 views
Eugene @VorobiovEugene
@nishanthkumar23 This is awesome, when are you planning to release the code?
1 reply · 0 retweets · 0 likes · 20 views
Nishanth Kumar retweeted
Lucy Cai @LucyCai9
Imagine you told a robot to "find your car keys" in your apartment and it looked around, opened a drawer, and retrieved them for you. As a step towards that, I adapted TiPToP to run on the RBY1 humanoid in our lab! Here's an example instruction it follows: "Put the green block on the blue plate and the yellow block on the magazine." TiPToP helps plan over the right arm + single torso joint, but it's easy to unlock more joints -- even the base wheels -- for more expressive, real-world tasks. Humans find objects without thinking twice. One day, robots will too! 🤖
[Quoted: the pinned TiPToP announcement]
4 replies · 13 retweets · 89 likes · 14K views
Nishanth Kumar @nishanthkumar23
@VectorWang2 Yes! We’re actually quite excited to integrate FFS directly and see how much it impacts overall execution time!
0 replies · 0 retweets · 1 like · 86 views
Nishanth Kumar @nishanthkumar23
@tonyzzhao Congrats!! Very excited for the new research and deployment milestones the team will hit in 2026! 🤖
0 replies · 0 retweets · 1 like · 105 views
Tony Zhao @tonyzzhao
We raised $165M at a $1.15B valuation to stop doing demos. 2026 is about 1) deployment and 2) research. We will start shipping Memo with our new frontier models in a few months. Our series-B is led by Coatue, with Thomas Laffont joining the board. ->🧵
114 replies · 102 retweets · 1.5K likes · 354.9K views
Nishanth Kumar @nishanthkumar23
@chris_j_paxton Thanks for the signal boost, Chris - this is a great way of putting things! We hope to keep working on finding the right tradeoff that will let us zero-shot as much as possible, ideally with zero hand-tuning as well :)
0 replies · 0 retweets · 0 likes · 53 views
Nishanth Kumar @nishanthkumar23
Good question - I’d say this system scales in a different way than end-to-end learning. It can, in some sense, scale with more test-time computation, but you’re right that it doesn’t scale as easily to deformable objects or things that are hard to simulate. Ultimately it will be important to incorporate end-to-end learning, or perhaps even replace this entire system with an end-to-end one, but I think studying systems like this one, which reason and scale at test time, can give us some useful insights towards a path forward for manipulation in general!
0 replies · 0 retweets · 0 likes · 169 views
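To make the "scales with test-time computation" point concrete: in sampling-based motion planning, throwing more parallel samples at a problem tends to yield better solutions with no new training data. The toy below is a hypothetical illustration (NumPy standing in for a GPU batch), not TiPToP's code; the obstacle, noise model, and costs are all made up.

```python
# Toy test-time scaling for motion planning: sample many candidate paths
# in a batch and keep the lowest-cost collision-free one. More samples
# means more compute and better solutions, with zero extra training data.
import numpy as np

rng = np.random.default_rng(0)

start, goal = np.zeros(2), np.array([1.0, 1.0])
obstacle, radius = np.array([0.5, 0.5]), 0.2  # toy circular obstacle


def sample_trajectories(n, horizon=20):
    """n random paths from start to goal: straight line plus noise."""
    t = np.linspace(0, 1, horizon)[None, :, None]   # (1, T, 1)
    line = start + t * (goal - start)               # (1, T, 2)
    noise = rng.normal(0.0, 0.15, (n, horizon, 2))
    noise[:, [0, -1]] = 0.0                         # pin both endpoints
    return line + noise                             # (n, T, 2)


def feasible(paths):
    """True where every waypoint clears the obstacle (batched check)."""
    d = np.linalg.norm(paths - obstacle, axis=-1)   # (n, T)
    return (d > radius).all(axis=1)


def path_length(paths):
    return np.linalg.norm(np.diff(paths, axis=1), axis=-1).sum(axis=1)


for n in (10, 100, 10_000):                         # increasing test-time compute
    paths = sample_trajectories(n)
    cost = np.where(feasible(paths), path_length(paths), np.inf)
    print(f"n={n:>6}: best feasible path length = {cost.min():.3f}")
```

With few samples, no path may clear the obstacle (the best cost stays infinite); as the batch grows, feasible solutions appear and the best one keeps improving, which is the scaling behavior the reply above describes.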
Jerry Chéng @thejerrycheng
Cool work, but how does it scale compared to end-to-end learning? Tasks like deformable object manipulation and bimanual manipulation would be difficult in this pipeline. Also, if one of the modules fails, would it still produce any meaningful output? It might end up being a big state machine in the end
1 reply · 0 retweets · 1 like · 201 views
Guanming Wang @Guanming717
@nishanthkumar23 Pick and place makes sense, as IK can be based on the predicted final pose and give actions, but for “erase”, how does it understand the continuous movement without training data?
1 reply · 0 retweets · 0 likes · 176 views
Nishanth Kumar @nishanthkumar23
@wenlong_huang Thanks so much Wenlong! Looking forward to more discussions + future work from you on integrated planning and learning!
0 replies · 0 retweets · 1 like · 120 views
Wenlong Huang @wenlong_huang
Planning is the test-time compute for robotics. Like AlphaGo and reasoning LLMs, it discovers solutions / behaviors beyond what’s in the data. Having seen the demos in person, I’m still impressed by the generality of the system. It worked surprisingly well with various instructions I came up with on the fly and with random in-the-wild objects, all zero-shot. The low-level controller developed by the team also has the best tracking I have seen on a Franka. Huge congratulations to the team!
[Quoted: the pinned TiPToP announcement]
2 replies · 6 retweets · 54 likes · 6.8K views
Nishanth Kumar @nishanthkumar23
@_jingcao Thank you for your consistent hard work, and for getting this entire system running in sim + evaluating it so quickly! I wish I'd been half as capable as you when I was an undergrad :)
0 replies · 0 retweets · 1 like · 120 views
Nishanth Kumar @nishanthkumar23
@Bw_Li1024 Thanks for the kind words Bowen! Yes - it was very interesting to see that most failures were due to our grasping module - you're welcome to try it out and help us make it better when the code is released! 😉
1 reply · 0 retweets · 1 like · 59 views
Bowen Li @Bw_Li1024
Very impressive results. I really like the failure analysis in the thread; it seems a big part to improve is grasping failures (physical contact introduces sim-to-real differences), and maybe non-prehensile behaviors beyond a TAMP model’s definition :)
[Quoted: the pinned TiPToP announcement]
1 reply · 0 retweets · 4 likes · 726 views
Nishanth Kumar @nishanthkumar23
@tomssilver Thanks so much for your comments + feedback on an earlier draft of this work! We can't wait for you to try it out super soon (we promise the code will be out ASAP!)
0 replies · 0 retweets · 1 like · 61 views
Nishanth Kumar @nishanthkumar23
@skymanaditya1 haha - thanks for trying it out! Excited for some contributions of new robots/functionality from your work! :)
1 reply · 0 retweets · 1 like · 88 views
William Shen @WillShenSaysHi
TiPToP was a fun project with great collaborators! We were surprised at how fast and general it is across objects, setups, and embodiments. Its limitations point toward combining end-to-end VLAs with planning and reasoning. Code coming soon! 🚀
[Quoted: the pinned TiPToP announcement]
1 reply · 2 retweets · 21 likes · 1.6K views
Nishanth Kumar @nishanthkumar23
@JorgeAMendez_ Thanks Jorge! Let us know any feedback, and we hope you'll try out the system for yourself! :)
0 replies · 0 retweets · 1 like · 44 views
Dylan Sam @dylanjsam
I defended my PhD thesis! Also, a very (~4 month) late life update, but I've joined @OpenAI to work on safety research and pretraining safer language models! 📈 Thank you to my advisor @zicokolter and my committee: Matt Fredrikson, @andrew_ilyas, and @furongh! 🙏
25 replies · 8 retweets · 218 likes · 20.6K views
Nishanth Kumar @nishanthkumar23
We hope you'll try TiPToP out and consider contributing! While we're excited by TiPToP's current capabilities, we also feel there's so much more to be done (check out the website for a list of things to be worked on).
🌐 Project: tiptop-robot.github.io
📄 Paper: arxiv.org/abs/2603.09971
💻 Code (coming soon; we're working hard on making it easy to run!): github.com/tiptop-robot/t…
TiPToP was a big team effort and wouldn't have been possible without @WillShenSaysHi, @sahitbot_irl, @JieWang_ZJUI, Christopher Watson, @edward_s_hu, @_jingcao, @dineshjayaraman, Leslie Pack Kaelbling, and Tomás Lozano-Pérez. Special thanks to the folks at Penn for their help with evaluation!
1 reply · 0 retweets · 11 likes · 877 views
Nishanth Kumar @nishanthkumar23
TiPToP is far from perfect:
- Open-loop execution → no recovery from failed grasps
- Single-viewpoint perception → limited visibility
- Lacks closed-loop reactivity of VLAs
We view TiPToP as a test-time scaling and reasoning method that's ultimately complementary to large robot foundation models like VLAs. We're excited about future research to more tightly combine these paradigms!
1 reply · 0 retweets · 7 likes · 792 views
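As a rough illustration of the "closed-loop reactivity" the limitations tweet above points to, the sketch below wraps a plan in an observe/retry loop instead of executing it open-loop. All function names, the fixed skill sequence, and the failure model are hypothetical stand-ins, not a proposed TiPToP API.

```python
# Toy closed-loop execution wrapper: re-observe after every step and retry
# on failure, rather than running a single plan open-loop. All functions
# below are illustrative stand-ins.
import random

random.seed(0)

def observe():
    """Stand-in for perception: returns the current world state."""
    return {"holding": None}

def make_plan(state):
    """Stand-in for the planner: a fixed skill sequence in this toy."""
    return ["grasp cup", "move to shelf", "release"]

def execute_step(step):
    """Stand-in for execution: fails 30% of the time (e.g. a slipped grasp)."""
    return random.random() > 0.3

def run_closed_loop(max_retries=3):
    for step in make_plan(observe()):
        for attempt in range(1, max_retries + 1):
            if execute_step(step):
                print(f"{step}: ok (attempt {attempt})")
                break
            observe()  # failure detected: re-perceive before retrying
        else:
            raise RuntimeError(f"{step} failed {max_retries} times")

run_closed_loop()
```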