Utkarsh Singhal

57 posts

Utkarsh Singhal

@utksinghal

Robotics @ Tesla Optimus | Previously: PhD @ UC Berkeley

Joined May 2021
520 Following · 119 Followers
Utkarsh Singhal retweeted
Chelsea Finn
Chelsea Finn@chelseabfinn·
Pi models are now running in production settings, in collaboration with @Ultraroboticsco and @weaverobotics. We see:
- much higher autonomy with pi-0.6 than with pi-0.5
- fewer mistakes & higher throughput from incorporating data in pre-training
Blog post: pi.website/blog/partner
Utkarsh Singhal retweeted
Haoru Xue
Haoru Xue@HaoruXue·
It was absolutely disgraceful seeing the venue people just rudely cut the mic. Even when @breadli428 repeatedly begged for just 3 more minutes to finish, the person simply repeated robotically: you should have been done. They were clearing rooms one by one. It's the event organizer's fault. Absolutely no respect at all.
Chenhao Li@breadli428

The toughest moment in a PhD:
>spend a year building something you're proud of
>travel across the world with your advisors' support to share it
>when your moment finally comes
>your mic gets cut because the previous schedule ran late
Heartbroken, but thanks to those who stayed for me @NeurIPSConf

Antonio Loquercio
Antonio Loquercio@antoniloq·
🤖🇮🇹🥳
Penn Engineering@PennEngineers

Congratulations to Antonio Loquercio (@antoniloq) on receiving the 2025 Mario Gerla Young Investigator Award from @issnaf. Recognized for his research on the pivotal role of perception in building effective world models for decision-making, Loquercio enhances the performance of complex robotic systems. He explores how robots can utilize their own sensor data to refine their world models. Congratulations, Antonio! bit.ly/4pbaUSE

Utkarsh Singhal
Utkarsh Singhal@utksinghal·
@_onionesque Also, the finding in question is explained in the first ~5 pages of the report. They didn't even read the summary!
Shubhendu Trivedi
Shubhendu Trivedi@_onionesque·
Very misleading post. Low ROI from generative AI pilots is mostly a reflection of immature AI strategy (still very rudimentary in most companies across verticals) and execution (better, but still lagging), not of the state of generative models.
The New Yorker@NewYorker

An M.I.T. study found that 95% of companies that had invested in A.I. tools were seeing zero return. It jibes with the emerging idea that generative A.I., “in its current incarnation, simply isn’t all it’s been cracked up to be,” @JohnCassidy writes. nyer.cm/FUZwzw8

Utkarsh Singhal
Utkarsh Singhal@utksinghal·
@SwayStar123 Matches my observations too! One thing I've found interesting is that landscapes/scenes converge the fastest. Is it a coincidence that the first large-ish generative models (e.g., iGANs) were for landscapes?
sway
sway@SwayStar123·
Tried adding contrastive flow matching loss. Worse FID @ 100k. I think what's happening is that CFM might only be beneficial for low-step sampling? I am doing these evaluations at 250 steps, whereas CFM mysteriously only reports at like 50 steps or something. You can also see the FID gain tapers off in one of their charts.
sway@SwayStar123

INVAE + REG = 7.15 FID @ 100k steps. Original SiT-XL/2 gets 8.3 FID @ 7M steps. So something like 70+ times faster training?

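For context on the thread above, here is a minimal sketch of a flow-matching training loss with a contrastive term bolted on, assuming the "repel the prediction from a mismatched target's velocity" formulation; the model interface, the shuffled-batch negatives, and the weight `lam` are illustrative assumptions, not the paper's implementation.

```python
import torch

def contrastive_flow_matching_loss(model, x1, lam=0.05):
    """Toy flow-matching step with a hedged contrastive term.

    x1: a batch of data samples. The velocity target along the linear path from
    noise x0 to data x1 is (x1 - x0). The extra term (weighted by `lam`) pushes
    the prediction away from the velocity toward a *mismatched* sample -- an
    assumption about how a contrastive objective could look, not a reference
    implementation of the paper being discussed.
    """
    x0 = torch.randn_like(x1)                                   # noise endpoint
    t = torch.rand(x1.shape[0], *[1] * (x1.dim() - 1), device=x1.device)
    xt = (1 - t) * x0 + t * x1                                  # point on the path
    v_pred = model(xt, t)

    match = ((v_pred - (x1 - x0)) ** 2).mean()                  # pull toward matched target
    x1_mis = x1[torch.randperm(x1.shape[0], device=x1.device)]  # mismatched targets
    repel = ((v_pred - (x1_mis - x0)) ** 2).mean()              # push away from them
    return match - lam * repel
```

If the hypothesis in the tweet is right, any benefit from the repulsion term would show up mostly at low sampling-step budgets rather than at 250 steps.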
Utkarsh Singhal
Utkarsh Singhal@utksinghal·
Check out our work @ ICML tomorrow! I can't join in person 🥲, but I'm genuinely excited to share this work. FoCal tackles the core issues I've encountered working on invariance over the last few years: complex transforms, scaling to foundation models, and data-driven invariance 😀
Ryan Feng@ryantfeng

Q: How do we scale robustness/invariance to foundation models like CLIP?
A: Test-time search! 🔍
Our new work FoCal finds canonical views to boost robustness to complex transforms (e.g. viewpoint): sutkarsh.github.io/projects/focal
📍 ICML Poster: Tue 11–1:30, E. Hall A-B (E-2203)
🧵 1/5

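As a rough illustration of the test-time search idea in the thread above (a sketch of the general recipe, not the FoCal implementation): generate candidate transformed views, score each with an uncertainty proxy, and classify the most "canonical" one. The entropy score, the transform set, and the `classifier` interface are assumptions.

```python
import torch
import torch.nn.functional as F

def canonicalize_and_classify(classifier, image, candidate_transforms):
    """Pick the candidate view the model is most confident about, then classify it.

    `classifier` maps an image tensor to logits; `candidate_transforms` is a list
    of callables (rotations, crops, simulated viewpoint changes, ...). Scoring
    views by prediction entropy is a stand-in for whatever energy the actual
    method optimizes.
    """
    best_view, best_score = image, float("inf")
    for tf in candidate_transforms:
        view = tf(image)
        probs = F.softmax(classifier(view), dim=-1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()
        if entropy.item() < best_score:                 # lower entropy = more "canonical"
            best_view, best_score = view, entropy.item()
    return classifier(best_view).argmax(dim=-1)
```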
Utkarsh Singhal
Utkarsh Singhal@utksinghal·
@aaron_defazio @torchcompiled Would love to read the paper! I have seen similar phenomena with noisy gradient estimates. My guess before seeing the paper: noise accumulates in the weights over many batches until it starts producing second-order (Hessian) effects.
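A back-of-the-envelope version of the guess above, assuming the accumulated weight noise ε has zero mean and covariance Σ (illustrative only):

```latex
% First-order Taylor expansion of the gradient around the noise-free weights w:
\nabla L(w + \varepsilon) \approx \nabla L(w) + H\varepsilon,
\qquad H = \nabla^2 L(w)
% With \mathbb{E}[\varepsilon] = 0 and \operatorname{Cov}(\varepsilon) = \Sigma:
\mathbb{E}\,\|\nabla L(w + \varepsilon)\|^2
  \approx \|\nabla L(w)\|^2 + \operatorname{tr}(H \Sigma H)
```

If Σ keeps growing as noisy updates accumulate, the curvature term can dominate, so the measured gradient norm climbs even when ∇L(w) itself does not.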
Aaron Defazio
Aaron Defazio@aaron_defazio·
@torchcompiled I have been working on a NeurIPS paper that fully explains this phenomenon. The fix is one line of code! I’ll share it as soon as I’m able.
Ethan
Ethan@torchcompiled·
why does grad norm continually grow with training? this is mad unintuitive. absolutely destroying my mental model of 2D convex bowl optimization.
Utkarsh Singhal
Utkarsh Singhal@utksinghal·
@EliSennesh Is this even a hot take, since all EBMs are probabilistic models and vice versa? (The only difference being implementation, which I will conveniently ignore :) )
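The correspondence being referenced, written out (Z is the normalizing constant; the identification holds wherever p(x) > 0):

```latex
p(x) = \frac{e^{-E(x)}}{Z}, \qquad Z = \int e^{-E(x)}\,dx
\quad\Longleftrightarrow\quad
E(x) = -\log p(x) + \text{const}
```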
Utkarsh Singhal retweeted
Chris Offner
Chris Offner@chrisoffner3d·
Our gloriously picked cherries. Their wicked failure cases.
Utkarsh Singhal
Utkarsh Singhal@utksinghal·
Hi all, I'll be at NeurIPS from Tuesday to Sunday. I would love to chat with people about invariance, test-time optimization, and vision. Please DM me if you'd like to talk (or catch up).
James Allingham
James Allingham@JamesAllingham·
I'll be at NeurIPS next week, presenting our work "A Generative Model of Symmetry Transformations." In it, we propose a symmetry-aware generative model that discovers which (approximate) symmetries are present in a dataset, and can be leveraged to improve data efficiency. 🧵⬇️
Mathieu
Mathieu@miniapeur·
Tell me your story. How did you get into research?
Utkarsh Singhal
Utkarsh Singhal@utksinghal·
@wgussml Really cool observations! Might be related to some of @thisismyhat's recent work :) Personally, I wonder if this is because (1) the loss functional is convex in function space; (2) averaging gradients ≈ averaging learned functions, at least for MNIST.
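One way to make point (2) precise for a model that is linear in its parameters (an assumption that holds at best approximately for the MNIST networks in question):

```latex
f_w(x) = w^\top \phi(x)
\;\Longrightarrow\;
\frac{1}{n}\sum_{i=1}^{n} f_{w_i}(x) = f_{\bar{w}}(x),
\qquad \bar{w} = \frac{1}{n}\sum_{i=1}^{n} w_i
```

In that regime, averaging parameter updates and averaging the learned functions coincide, which would make the MNIST result less surprising.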
william
william@wgussml·
In this experiment we take n random initializations and train them on disparate data, but average their gradients and update them all with this single mean gradient (this is not an ensemble)! Surprisingly, on MNIST this mean gradient leads to faster convergence (even when adjusting for the number of examples seen)!?

This property doesn't hold in general; on CIFAR, e.g., the opposite is true (which you should expect). It's interesting to me that the loss landscape of MNIST is such that the "mean path" of Adam yields a global optimum. Likely a statement of dataset simplicity, but surprising nonetheless.
william@wgussml

what in the actual f*ck this is incredible

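A minimal sketch of the experiment described above, assuming n independently initialized models, disjoint data shards, and one shared mean gradient applied to all of them; the architecture, optimizer, and sharding details are placeholders.

```python
import torch

def mean_gradient_step(models, optimizers, batches, loss_fn):
    """One step of the shared-mean-gradient experiment: each model sees its own
    (disjoint) batch, but every model is updated with the *average* of the
    per-model gradients. This is not an ensemble; all n models keep training."""
    # Per-model backward passes on disjoint data.
    for model, opt, (x, y) in zip(models, optimizers, batches):
        opt.zero_grad()
        loss_fn(model(x), y).backward()

    # Average gradients parameter-by-parameter across the n models.
    for params in zip(*(m.parameters() for m in models)):
        mean_grad = torch.stack([p.grad for p in params]).mean(dim=0)
        for p in params:
            p.grad = mean_grad.clone()

    # Every model takes the same mean-gradient step.
    for opt in optimizers:
        opt.step()
```

Each model keeps its own initialization, so any speed-up would have to come purely from sharing the update direction.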