How can we enable finetuning of humanoid manipulation policies, directly in the real world?
In our new paper, Residual Off-Policy RL for Finetuning BC Policies, we demonstrate real-world RL on a bimanual humanoid with 5-fingered hands (29 DoF) and improve pre-trained policies with ~15-75 minutes of robot interaction.
By learning residual corrections on top of frozen BC policies with a sample-efficient off-policy RL algorithm, we can finetune policies directly on the hardware. To our knowledge, this is one of the first examples of real-world RL finetuning on a humanoid with bimanual dexterous hands.
(If you know of other examples, let me know!)
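The core idea, as described above, is that the final action is the frozen BC policy's action plus a small learned correction, so the RL agent only has to explore near the BC policy's behavior. A minimal sketch of that action composition (names like `base_policy`, `residual_policy`, and `residual_scale` are illustrative assumptions, not from the paper's codebase):

```python
import numpy as np

ACTION_DIM = 29  # e.g. a 29-DoF bimanual humanoid
OBS_DIM = 64

def base_policy(obs):
    """Frozen BC policy: returns a nominal action (stand-in here)."""
    return np.tanh(obs[:ACTION_DIM])

def residual_policy(obs, params):
    """Small learned correction head; its params are what off-policy RL trains."""
    return np.tanh(params @ obs)  # bounded residual

def act(obs, params, residual_scale=0.1):
    # Final action = frozen BC action + small learned residual, clipped to limits.
    a_bc = base_policy(obs)
    delta = residual_scale * residual_policy(obs, params)
    return np.clip(a_bc + delta, -1.0, 1.0)

rng = np.random.default_rng(0)
obs = rng.standard_normal(OBS_DIM)
params = np.zeros((ACTION_DIM, OBS_DIM))  # zero-init: start exactly at the BC policy
a = act(obs, params)
assert np.allclose(a, base_policy(obs))  # zero residual at initialization
```

Zero-initializing the residual head means the combined policy starts out identical to the pre-trained BC policy, which is one common way residual RL setups avoid destroying the base behavior early in finetuning.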