Utkarsh Singhal

57 posts

Utkarsh Singhal

@utksinghal

Robotics @ Tesla Optimus | Previously: PhD @ UC Berkeley

Joined May 2021
520 Following · 119 Followers
Utkarsh Singhal retweeted
Chelsea Finn
Chelsea Finn@chelseabfinn·
Pi models are now running in production settings, in collaboration with @Ultraroboticsco and @weaverobotics. We see:
- much higher autonomy with pi-0.6 than with pi-0.5
- fewer mistakes & higher throughput from incorporating data in pre-training
Blog post: pi.website/blog/partner
Utkarsh Singhal retweeted
Haoru Xue
Haoru Xue@HaoruXue·
It was absolutely disgraceful seeing the venue people just rudely cut the mic. Even when @breadli428 repeatedly begged for just 3 more minutes to finish, the person simply repeated robotically: you should have been done. They were clearing rooms one by one. It's the event organizer's fault. Absolutely no respect at all.
Chenhao Li@breadli428

The toughest moment in a PhD:
>spend a year building something you're proud of
>travel across the world with your advisors' support to share it
>when your moment finally comes
>your mic gets cut because the previous schedule ran late
Heartbroken, but thanks to those who stayed for me @NeurIPSConf

Antonio Loquercio
Antonio Loquercio@antoniloq·
🤖🇮🇹🥳
Penn Engineering@PennEngineers

Congratulations to Antonio Loquercio (@antoniloq) on receiving the 2025 Mario Gerla Young Investigator Award from @issnaf. Recognized for his research on the pivotal role of perception in building effective world models for decision-making, Loquercio enhances the performance of complex robotic systems. He explores how robots can utilize their own sensor data to refine their world models. Congratulations, Antonio! bit.ly/4pbaUSE

Utkarsh Singhal
Utkarsh Singhal@utksinghal·
@_onionesque Also, the finding in question is explained in the first ~5 pages of the report. They didn't even read the summary!
Shubhendu Trivedi
Shubhendu Trivedi@_onionesque·
Very misleading post. Low ROI from generative AI pilots is mostly a reflection of immature AI strategy (still very rudimentary in most companies across verticals) and execution (better, but still lagging), not of the state of generative models.
The New Yorker@NewYorker

An M.I.T. study found that 95% of companies that had invested in A.I. tools were seeing zero return. It jibes with the emerging idea that generative A.I., “in its current incarnation, simply isn’t all it’s been cracked up to be,” @JohnCassidy writes. nyer.cm/FUZwzw8

Utkarsh Singhal
Utkarsh Singhal@utksinghal·
@SwayStar123 Matches my observations too! One thing I've found interesting is that landscapes/scenes converge the fastest. Is it a coincidence that the first large-ish generative models (e.g., iGANs) were for landscapes?
sway
sway@SwayStar123·
Tried adding contrastive flow matching loss. Worse FID @ 100k. I think what's happening is that CFM might only be beneficial for low-step sampling? I am doing these evaluations at 250 steps, whereas CFM mysteriously only reports at like 50 steps or something. You can also see the FID gain tapers off in one of their charts.
sway@SwayStar123

INVAE + REG = 7.15 FID @ 100k steps. Original SiT-XL/2 gets 8.3 FID @ 7M steps. So something like 70+ times faster training?

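For context on the thread above, here is a minimal sketch of a flow-matching training loss with a contrastive term bolted on, assuming the "repel the prediction from a mismatched target's velocity" formulation; the model interface, the shuffled-batch negatives, and the weight `lam` are illustrative assumptions, not the paper's implementation.

```python
import torch

def contrastive_flow_matching_loss(model, x1, lam=0.05):
    """Toy flow-matching step with a hedged contrastive term.

    x1: a batch of data samples. The velocity target along the linear path from
    noise x0 to data x1 is (x1 - x0). The extra term (weighted by `lam`) pushes
    the prediction away from the velocity toward a *mismatched* sample -- an
    assumption about how a contrastive objective could look, not a reference
    implementation of the paper being discussed.
    """
    x0 = torch.randn_like(x1)                                   # noise endpoint
    t = torch.rand(x1.shape[0], *[1] * (x1.dim() - 1), device=x1.device)
    xt = (1 - t) * x0 + t * x1                                  # point on the path
    v_pred = model(xt, t)

    match = ((v_pred - (x1 - x0)) ** 2).mean()                  # pull toward matched target
    x1_mis = x1[torch.randperm(x1.shape[0], device=x1.device)]  # mismatched targets
    repel = ((v_pred - (x1_mis - x0)) ** 2).mean()              # push away from them
    return match - lam * repel
```

If the hypothesis in the tweet is right, any benefit from the repulsion term would show up mostly at low sampling-step budgets rather than at 250 steps.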
Utkarsh Singhal
Utkarsh Singhal@utksinghal·
Check out our work @ ICML tomorrow! I can't join in person 🥲, but I'm genuinely excited to share this work. FoCal tackles the core issues I've encountered working on invariance over the last few years: complex transforms, scaling to foundation models, and data-driven invariance 😀
Ryan Feng@ryantfeng

Q: How do we scale robustness/invariance to foundation models like CLIP?
A: Test-time search! 🔍
Our new work FoCal finds canonical views to boost robustness to complex transforms (e.g. viewpoint): sutkarsh.github.io/projects/focal
📍 ICML Poster: Tue 11–1:30, E. Hall A-B (E-2203)
🧵 1/5

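As a rough illustration of the test-time search idea in the thread above (a sketch of the general recipe, not the FoCal implementation): generate candidate transformed views, score each with an uncertainty proxy, and classify the most "canonical" one. The entropy score, the transform set, and the `classifier` interface are assumptions.

```python
import torch
import torch.nn.functional as F

def canonicalize_and_classify(classifier, image, candidate_transforms):
    """Pick the candidate view the model is most confident about, then classify it.

    `classifier` maps an image tensor to logits; `candidate_transforms` is a list
    of callables (rotations, crops, simulated viewpoint changes, ...). Scoring
    views by prediction entropy is a stand-in for whatever energy the actual
    method optimizes.
    """
    best_view, best_score = image, float("inf")
    for tf in candidate_transforms:
        view = tf(image)
        probs = F.softmax(classifier(view), dim=-1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()
        if entropy.item() < best_score:                 # lower entropy = more "canonical"
            best_view, best_score = view, entropy.item()
    return classifier(best_view).argmax(dim=-1)
```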
Utkarsh Singhal
Utkarsh Singhal@utksinghal·
@aaron_defazio @torchcompiled Would love to read the paper! I have seen similar phenomena with noisy gradient estimates. My guess before seeing the paper: noise accumulates in the weights over many batches until it starts producing second-order (Hessian) effects.
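A back-of-the-envelope version of the guess above, assuming the accumulated weight noise ε has zero mean and covariance Σ (illustrative only):

```latex
% First-order Taylor expansion of the gradient around the noise-free weights w:
\nabla L(w + \varepsilon) \approx \nabla L(w) + H\varepsilon,
\qquad H = \nabla^2 L(w)
% With \mathbb{E}[\varepsilon] = 0 and \operatorname{Cov}(\varepsilon) = \Sigma:
\mathbb{E}\,\|\nabla L(w + \varepsilon)\|^2
  \approx \|\nabla L(w)\|^2 + \operatorname{tr}(H \Sigma H)
```

If Σ keeps growing as noisy updates accumulate, the curvature term can dominate, so the measured gradient norm climbs even when ∇L(w) itself does not.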
Aaron Defazio
Aaron Defazio@aaron_defazio·
@torchcompiled I have been working on a NeurIPS paper that fully explains this phenomenon. The fix is one line of code! I’ll share it as soon as I’m able.
Ethan
Ethan@torchcompiled·
why does grad norm continually grow with training? this is mad unintuitive. absolutely destroying my mental model of 2D convex bowl optimization.
Utkarsh Singhal
Utkarsh Singhal@utksinghal·
@EliSennesh Is this even a hot take, since all EBMs are probabilistic models and vice versa? (The only difference being implementation, which I will conveniently ignore :) )
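The correspondence being referenced, written out (Z is the normalizing constant; the identification holds wherever p(x) > 0):

```latex
p(x) = \frac{e^{-E(x)}}{Z}, \qquad Z = \int e^{-E(x)}\,dx
\quad\Longleftrightarrow\quad
E(x) = -\log p(x) + \text{const}
```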
Utkarsh Singhal retweeted
Chris Offner
Chris Offner@chrisoffner3d·
Our gloriously picked cherries. Their wicked failure cases.
Utkarsh Singhal
Utkarsh Singhal@utksinghal·
Hi all, I'll be at NeurIPS from Tuesday to Sunday. I would love to chat with people about invariance, test-time optimization, and vision. Please DM me if you'd like to talk (or catch up).
James Allingham
James Allingham@JamesAllingham·
I'll be at NeurIPS next week, presenting our work "A Generative Model of Symmetry Transformations." In it, we propose a symmetry-aware generative model that discovers which (approximate) symmetries are present in a dataset, and can be leveraged to improve data efficiency. 🧵⬇️
Mathieu
Mathieu@miniapeur·
Tell me your story. How did you get into research?
Utkarsh Singhal
Utkarsh Singhal@utksinghal·
@wgussml Really cool observations! Might be related to some of @thisismyhat's recent work :) Personally, I wonder if this is because (1) the loss functional is convex in function space; (2) averaging gradients ≈ averaging learned functions, at least for MNIST.
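One way to make point (2) precise for a model that is linear in its parameters (an assumption that holds at best approximately for the MNIST networks in question):

```latex
f_w(x) = w^\top \phi(x)
\;\Longrightarrow\;
\frac{1}{n}\sum_{i=1}^{n} f_{w_i}(x) = f_{\bar{w}}(x),
\qquad \bar{w} = \frac{1}{n}\sum_{i=1}^{n} w_i
```

In that regime, averaging parameter updates and averaging the learned functions coincide, which would make the MNIST result less surprising.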
william
william@wgussml·
In this experiment we take n random initializations and train them on disparate data, but average their gradients and update them all with this single mean gradient (this is not an ensemble)! Surprisingly, on MNIST this mean gradient leads to faster convergence (even when adjusting for the number of examples seen)!?

This property doesn't hold in general; on CIFAR, e.g., the opposite is true (which you should expect). It's interesting to me that the loss landscape of MNIST is such that the "mean path" of Adam yields a global optimum. Likely a statement of dataset simplicity, but surprising nonetheless.
william@wgussml

what in the actual f*ck this is incredible

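A minimal sketch of the experiment described above, assuming n independently initialized models, disjoint data shards, and one shared mean gradient applied to all of them; the architecture, optimizer, and sharding details are placeholders.

```python
import torch

def mean_gradient_step(models, optimizers, batches, loss_fn):
    """One step of the shared-mean-gradient experiment: each model sees its own
    (disjoint) batch, but every model is updated with the *average* of the
    per-model gradients. This is not an ensemble; all n models keep training."""
    # Per-model backward passes on disjoint data.
    for model, opt, (x, y) in zip(models, optimizers, batches):
        opt.zero_grad()
        loss_fn(model(x), y).backward()

    # Average gradients parameter-by-parameter across the n models.
    for params in zip(*(m.parameters() for m in models)):
        mean_grad = torch.stack([p.grad for p in params]).mean(dim=0)
        for p in params:
            p.grad = mean_grad.clone()

    # Every model takes the same mean-gradient step.
    for opt in optimizers:
        opt.step()
```

Each model keeps its own initialization, so any speed-up would have to come purely from sharing the update direction.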