Vrushank Desai

533 posts

@vrushankdes

don’t take life too seriously, nobody gets out alive anyway

Joined March 2024

591 Following · 889 Followers

Pinned Tweet

Vrushank Desai@vrushankdes·Oct 4

Meet Robo! i'm in the very early stages of building an accessible, general-purpose robotics platform, and looking for a few passionate people to help shape it from ground up. if you'd be excited about moving fast on a small, hands-on team I would love to chat -- my DMs are open! (demo is teleop, 1x speed)

Vrushank Desai@vrushankdes·Jan 25

@___Harald___ @marcospereeira this might be cope but i do think the loss landscape from 1990 to 2020 had a lot less 'fog of war' than today's world (US maintaining global geopolitical/financial stability + obvious tech tree to climb)

Harald Schäfer@___Harald___·Jan 25

@marcospereeira Realizing how successful centralized power has been for China made me sad. It's like finding out Santa isn't real.

Harald Schäfer@___Harald___·Jan 25

Now that China has the world's most dynamic market economy, isn't it embarrassing to have been one of those people that claims communism can't make iphones? The American "capitalists" are figuring out how to charge more for less, the Chinese "communists" are just building more.

Vrushank Desai@vrushankdes·Nov 18

@YuXiang_IRVL i've been wondering the same, i think we will soon run into hard limits of what the hardware can do bc robot hands/tactile sense sucks. the switch from software to hardware as bottleneck came faster than i expected

Yu Xiang@YuXiang_IRVL·Nov 18

Is real-world RL becoming a cheat code for robot tasks? If we take a task A, run imitation learning, then fine-tune with real-world RL, the task will almost certainly work. So what am I missing? (We’re not talking about generalization to task B here.)

Yu Xiang@YuXiang_IRVL

The π*0.6 training recipe: 1️⃣Train a VLA on demonstration data 2️⃣Roll out the VLA to collect on-policy data (with optional human corrections) 3️⃣Learn a value function 4️⃣Train an advantage-conditioned policy Iterate. For café, 414 autonomous episodes + 429 correction episodes
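The four steps in the recipe above can be sketched as a toy loop. This is only an illustration under loud assumptions, not the π*0.6 implementation: a single-state, three-action task stands in for the robot, a smoothed count table stands in for the VLA, the value function is just the mean return of the latest rollouts, human corrections are omitted, and every name (`reward`, `fit_policy`, `train`) is hypothetical.

```python
import random
from collections import Counter

ACTIONS = [0, 1, 2]

def reward(action):
    # Hypothetical task: only action 2 counts as a success.
    return 1.0 if action == 2 else 0.0

def fit_policy(data):
    # data is a list of (action, flag) pairs; flag 1 means "non-negative advantage".
    # Returns one action distribution per flag, i.e. a policy conditioned on the flag.
    policy = {}
    for flag in (0, 1):
        counts = Counter(a for a, f in data if f == flag)
        total = sum(counts[a] + 1 for a in ACTIONS)  # add-one smoothing
        policy[flag] = {a: (counts[a] + 1) / total for a in ACTIONS}
    return policy

def train(demos, iterations=5, episodes=200, seed=0):
    rng = random.Random(seed)
    # Step 1: fit on demonstrations, all flagged as good (an assumption of this toy).
    data = [(a, 1) for a in demos]
    policy = fit_policy(data)
    for _ in range(iterations):
        # Step 2: roll out the current policy, conditioned on the "good" flag.
        weights = [policy[1][a] for a in ACTIONS]
        rollouts = rng.choices(ACTIONS, weights=weights, k=episodes)
        returns = [reward(a) for a in rollouts]
        # Step 3: "value function" = mean return of this batch (single state).
        value = sum(returns) / episodes
        # Step 4: flag each episode by the sign of its advantage, then refit the
        # flag-conditioned policy on all data so far. Using >= keeps successes
        # flagged as good even when every episode in the batch succeeds.
        data += [(a, 1 if r >= value else 0) for a, r in zip(rollouts, returns)]
        policy = fit_policy(data)
    return policy
```

Calling `train([0, 1, 2, 2])` and reading `policy[1]` gives the action distribution used at deployment, when the policy is conditioned on the "good" flag.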

Vrushank Desai@vrushankdes·Nov 4

@chris_j_paxton i'm interested in seeing how this bet plays out but i am sorta bearish in the long-run. i think they rely heavily on expensive robots with harmonic drives to stay in-distribution with UMI data but ideally the policy can just handle cheaper arms that have backlash (ALOHA/ARX)

Chris Paxton@chris_j_paxton·Nov 4

Generalist has probably the best manipulation videos around. The motions are so nice for very complex tasks.

Generalist@GeneralistAI

We’ve developed a new approach to training models, Harmonic Reasoning, which creates a "harmonic" interplay between asynchronous, continuous-time streams of sensing and acting tokens. ⚙️🎵 Watch GEN-0 pack a camera. 🤖📸

Vrushank Desai@vrushankdes·Oct 20

@phethers ooh ive never tried flux on the wick, thanks!

Paul Hetherington@paulcjh_·Oct 20

@vrushankdes Use very thin copper wick and a tonnnnn of liquid flux on the part+wick, solder iron at 400 and just hold the wick on the area you're removing solder from

Paul Hetherington@paulcjh_·Oct 19

Regret choosing LQFP packages for some parts it has massively slowed down production, the legs provide a lot of surface area for solder which messes up some of the reflow process. Next iteration will have no legs on the larger fine pitch parts, you can swipe off any excess solder so easily

Vrushank Desai@vrushankdes·Oct 20

@lukebayes @phethers hmm i was having issues around 350C, i'll try a bit warmer and a wider tip, thanks!

Luke Bayes (Eight Amps)@lukebayes·Oct 20

Assuming you're not pushing super hard, it could be too much heat and melting the package. I've had issues with using too small an iron tip. This forced me to hold too long and damage parts. The fix (for me) was using a much larger wedge tip at ~380C, lots of flux and dragging (gently). Less time in contact and reflow happens before plastics melt. Search YT for, "Drag Soldering QFP"

Vrushank Desai@vrushankdes·Oct 1

@xiao_ted OK wow, just saw the paper. very impressive if the model generalizes without explicit alignment!

Vrushank Desai@vrushankdes·Oct 1

@xiao_ted is the output action space joint angles or end-effector position and is an overhead camera view part of the input? asking because UMI-style data collection can lead to 'motion generalization' but its more of a property of the IK-solver + standardized camera views/action space

Ted Xiao@xiao_ted·Oct 1

It is very difficult to properly convey “zero-shot generalization”, it’s a very overloaded and overabused term these days! But let me try to add some color to why so many of my colleagues have been so shocked at what the Gemini Robotics 1.5 VLA has been doing… 🎨
The whole point of a VLA is to guarantee positive transfer by leveraging internet data to learn capabilities that you don’t need to cover with expensive real-world data collection on your robot. And this kind of works! We’ve now seen this positive transfer appear many times as semantic or visual policy generalization: understanding new visual-lingual concepts like “Taylor Swift” or “healthy snack” or being more robust to background and lighting changes.
But *motion level generalization* has remained elusive. Back in 2023 with OXE/RT-X, we thought that naively pre-training a VLA on massively cross-embodied robot data mixtures might give us some of this right off the bat. We searched extensively for motion level improvements. For example, while our Google robot had never collected “pouring” task data, we knew that other robots had seen this data. But despite extensive evaluations, the main gains we found were semantic: understanding concepts like “next to” or “on top of”.
In 2025, many *post-training* methods now aim to showcase motion-level transfer (i.e., across different types of tabletop manipulators with different kinematics or from egocentric human to humanoid) by explicitly retargeting. But the dream, I would argue, has always been to discover a general motion-generalization approach which is extendable to *pre-training*.
With the Gemini Robotics 1.5 VLA, we pushed hard to make a dent in this problem, but we were still very puzzled at how strong the initial evaluations were. We thought that there must be a bug, some data leakage. But after thorough debugging and validation, we began to believe the results.
A *pre-trained checkpoint*, trained on disjoint data distributions on multiple robot platforms (Aloha, Bi-arm Franka, Apollo Humanoid) was showcasing motion level transfer! Despite the Aloha *never* seeing a tool hanging task (reaching at that height and horizontal depth with very specific asymmetric motion needed for a precise hang), it was able to repeatably succeed. Same story for Aloha to Apollo, for Bi-arm Franka to Aloha, all the bi-directional pairs.
To be honest, the results still feel shocking. It feels like the very early days of RT-2 when we first started seeing promising signs of life with the first large-scale VLA training run. I have more questions than answers right now, but this is where some of the science and the real fun begins 🚀

Konstantinos Bousmalis@bousmalis

You have to watch this! For years now, I've been looking for signs of nontrivial zero-shot transfer across seen embodiments. When I saw the Alohas unhang tools from a wall used only on our Frankas I knew we had it! Gemini Robotics 1.5 is the first VLA to achieve such transfer!!

Vrushank Desai retweeted

Jay Alto@theJayAlto·Sep 27

it terrifies me how many talented people are contributing to a future they wouldn't want to live in. building something addictive. exploitative. destructive. something they'll ban their own kids from using. and for what? a paycheck? surely there's a better way to use your gifts?

Vrushank Desai@vrushankdes·Sep 18

@JimDMiller @Leo_Abstract @subygan @dwarkesh_sp @gwern i think bipedalism + hand is extremely underrated. an elephant's ability to channel more intelligence into evolutionary fitness is pretty bottlenecked by lack of appropriate end-effectors. whereas humans could write/dig and plant seeds/use spears/etc

James Miller@JimDMiller·Sep 17

@Leo_Abstract @subygan @dwarkesh_sp @gwern But what about getting smarter because of competition with other elephants?

Dwarkesh Patel@dwarkesh_sp·Sep 17

A point I've heard both Carl and Gwern make is that with primates, evolution finally found both a scalable brain architecture AND a niche that rewarded marginal increases in intelligence. Some birds are really smart for the size of their brains. But they're in a niche that punishes bigger heavier brains - they'll fall out of the sky. The difference in neuron count between primate species' brains is proportional to their brain mass, suggesting a scalable brain architecture. By contrast, for rodents and insectivores, neuron count scales sublinearly with mass. Another feedback loop like the cooking example: opposable thumbs allow us to make tools, which increases the value of having bigger brains to design said tools. Which incentivizes more dexterous hands...

Chris Painter@ChrisPainterYup

I often return to this idea from @dwarkesh_sp's interview with Carl Shulman: When humans got smart enough they began cooking food to externalize digestion, freeing up energy for even larger brains. Wild example of intelligence-driven recursive self-improvement in nature.

Vrushank Desai@vrushankdes·Sep 16

@MechanizeWork dont take the bait guys dont do it

Mechanize@MechanizeWork·Sep 15

Open source is not a viable strategy for producing RL environments at scale. The future of AGI lies in closed source development.

Matthew Barnett@MatthewJBar

The AI industry seems biased towards open source development, even though it keeps failing to deliver. OpenAI was founded on the idea of open source, only to abandon it. Meta backed open source, but now seems to be walking back. Mistral has barely had any market impact.

Vrushank Desai@vrushankdes·Sep 2

Secretary Chris Wright@SecretaryWright

Even if you wrapped the entire planet in a solar panel, you would only be producing 20% of global energy. One of the biggest mistakes politicians can make is equating the ELECTRICITY with ENERGY!

Vrushank Desai retweeted

spor@sporadica·Aug 30

humans are so cool, man bunch of apes walked out the jungle and went to the moon and made the sand think crazy fucking apes

Vrushank Desai retweeted

near@nearcyan·Aug 21

americans wouldn't have such a strong distaste for silicon valley if any of you were willing to stand up for principles. but alas, you haven't vested yet have you

Vrushank Desai@vrushankdes·Aug 21

@faraz_r_khan have you tried V-blocks? amazon.com/Grizzly-H5608-…

Faraz Khan@faraz_r_khan·Aug 21

Question for the mech engrs: how do I hold/clamp a bicycle head tube down super rigidly so that it doesn’t rotate at all? Trying to measure frame lateral / torsion flex but keeping the tube from rotating under a large torque is being difficult.

Vrushank Desai@vrushankdes·Aug 14

@jachiam0 mundane long-tails that accelerate day-to-day progress are still rough. for example, today gpt-5 web-search just hallucinated a handful of phone numbers for city food departments. when this is marketed as "Death Star Phd-intelligence" it can be frustrating...

Joshua Achiam@jachiam0·Aug 14

This feels like an increasingly accurate description of the public reaction to new frontier models. In truth: progress is not slowing down. Each successive delta in model intelligence is just useful to fewer and fewer people.

Joshua Achiam@jachiam0

A strange phenomenon I expect will play out: for the next phase of AI, it's going to get better at a long tail of highly-specialized technical tasks that most people don't know or care about, creating an illusion that progress is standing still.

Vrushank Desai@vrushankdes·Aug 9

@saranormous marshallbrain.com/manna worth a read!

sarah guo@saranormous·Aug 9

3/ Machine-Verified Trades

sarah guo@saranormous·Aug 9

1/ we're extending the application deadline for @Conviction Embed, our grant program for AI-native startups to 8/14 it's the best time in history to start a company

Vrushank Desai@vrushankdes·Aug 9

@willdepue @rationalaussie statement feels disingenuous considering how much effort is put into curation/expert dataset creation. consider a model pre-trained on everything from the internet OAI chooses to leave out of the corpus (SEO/auto-generated logs/etc). is that going to work well?

will depue@willdepue·Aug 9

@rationalaussie dawg what do you think pretraining is. you just train on the firehose of the internet. actual opposite of verified. seems to work pretty well regardless?

will depue@willdepue·Aug 9

everyone who mentions ‘continual learning’ as a problem is usually just talking about sample efficiency. clearly, you should ‘continually learn’ by continually training trajectories back into the model! there’s no mystery: this just doesn’t work with low sample efficiency.

Vrushank Desai@vrushankdes·Aug 9

@ErebiusWhite certainly a lot of scams would be exposed. like trying to trade powerpoint consulting services for a house and groceries... money is a dangerous abstraction

Erebius@ErebiusWhite·Aug 9

We do need nine to fivers, don't we? Or if everyone built something and traded it for something, wouldn't that work? Pure capitalism

Adomas Mazeikis@kjupdz

@ErebiusWhite Pointless existence

Vrushank Desai@vrushankdes·Aug 9

@xlr8harder whAt aBoUT tiAnaMen SquAre?

xlr8harder@xlr8harder·Aug 8

It's a really sad state of affairs that if you want a minimally censored (at least on US politics) high performing open source AI model, you need to source it from China.

xlr8harder@xlr8harder

With other comparable models: MiniMax M1 40k: 89.1% GLM-4.5: 70% DeepSeek R1 0528: 62.8% Qwen 3 235B 2507 (thinking): 50.2% gpt-oss-120B: 36.6%
