Josh McClellan

36 posts

@JoshMcClellan0

I research generalization and multi-agent systems in reinforcement learning

Joined April 2020
80 Following · 20 Followers
Josh McClellan@JoshMcClellan0·
@ColeJacksonFB Look at his PFF 2024 though (not '23, my bad): "Simpson had career-high grades across the board, earning a 77.3 overall grade, the 13th highest among guards, and a 79.2 run blocking grade, which also ranked top 10 at the position."
Cole Jackson@ColeJacksonFB·
For those of you that like PFF rankings: John Simpson ranked 41st among NFL Guards in 2025. For reference: LG Andrew Vorhees ranked 47th. RG Daniel Faalele ranked 42nd. #RavensFlock
Josh McClellan@JoshMcClellan0·
@ColeJacksonFB Oops, sorry, meant 2024. According to PFF, which is what you posted, "Simpson had career-high grades across the board, earning a 77.3 overall grade, the 13th highest among guards, and a 79.2 run blocking grade, which also ranked top 10 at the position."
Josh McClellan retweeted
Robert Youssef@rryssf_·
ICLR 2025 just gave an Outstanding Paper Award to a method that fixes model editing with one line of code 🤯

here's the problem it solves: llms store facts in their parameters. sometimes those facts are wrong or outdated. "model editing" lets you surgically update specific facts without retraining the whole model.

the standard approach: find which parameters encode the fact (using causal tracing), then nudge those parameters to store the new fact. works great for one edit. but do it a hundred times in sequence and the model starts forgetting everything else. do it a thousand times and it degenerates into repetitive gibberish. every edit that inserts new knowledge corrupts old knowledge. you're playing whack-a-mole with the model's memory.

AlphaEdit reframes the problem. instead of asking "how do we update knowledge with less damage?" the authors ask "how do we make edits mathematically invisible to preserved knowledge?"

the trick: before applying any parameter change, project it onto the null space of the preserved knowledge matrix. in plain english: find the directions in parameter space where you can move freely without affecting anything the model already knows. only move in those directions. it's like remodeling one room in a house by only touching walls that aren't load-bearing. the rest of the structure doesn't even know anything changed.

the results from Fang et al. across GPT2-XL, GPT-J, and LLaMA3-8B:
> average 36.7% improvement over existing editing methods
> works as a plug-and-play addition to MEMIT, ROME, and others
> models maintain 98.48% of general capabilities after 3,000 sequential edits
> prevents the gibberish collapse that kills other methods at scale

and the implementation is literally one line of code added to existing pipelines.

what i find genuinely elegant: the paper proves mathematically that output remains unchanged when querying preserved knowledge. this isn't "it works better in practice." it's "we can prove it doesn't touch what it shouldn't."

the honest caveats: the largest model tested was LLaMA3-8B. nobody's shown this works at 70B+ scale yet. a follow-up paper (AlphaEdit+) flagged brittleness when new knowledge directly conflicts with preserved knowledge, which is exactly the hardest case in production. and the whole approach assumes causal tracing correctly identifies where facts live, which isn't always clean.

but as a core insight, this is the kind of work that deserves the award. not because it solves everything. because it changes the question. the era of "edit and pray" for llm knowledge updates might actually be ending.
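the null-space projection above is simple enough to sketch in a few lines of NumPy. this is a toy illustration of the idea only, not the AlphaEdit implementation: `K0` (a matrix of preserved-knowledge key vectors) and `dW` (a raw weight edit) are random placeholders invented for the example.

```python
import numpy as np

# toy dimensions: d_in input features, d_out output features,
# n preserved-knowledge keys (all made up for illustration)
d_in, d_out, n = 8, 6, 3
rng = np.random.default_rng(0)

# K0: columns are key vectors whose outputs the edited layer must preserve
K0 = rng.standard_normal((d_in, n))

# Projector onto the orthogonal complement of span(K0), so P @ K0 == 0
P = np.eye(d_in) - K0 @ np.linalg.pinv(K0)

dW = rng.standard_normal((d_out, d_in))  # stand-in for a raw weight edit
dW_safe = dW @ P                         # the "one line": project the edit

# for any preserved key k, (W + dW_safe) @ k == W @ k, since dW_safe @ K0 ≈ 0
assert np.allclose(dW_safe @ K0, 0, atol=1e-8)
```

the projected edit can still move the weights in every direction orthogonal to the preserved keys, which is where the new fact gets written; it just provably cannot change the layer's response to the keys in `K0`.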
Josh McClellan retweeted
Max Simchowitz@max_simchowitz·
⏰⏰ New Science of Robot Learning Paper: "Much Ado About Noising." TL;DR: we answer why generative models, like flow and diffusion models, actually work for robotic control tasks🤖🤖 (hint: it's not multimodality). This leads to a new minimal iterative policy (MIP) that matches flow models with much faster inference🚄🚄 Check out @ChaoyiPan's thread and paper to find out more. Amazing work by @ChaoyiPan, together with @GuanyaShi, @nmboffi, @guannanqu. Come find us at NeurIPS to chat more!
Chaoyi Pan@ChaoyiPan

Generative models (diffusion/flow) are taking over robotics 🤖. But do we really need to model the full action distribution to control a robot? We suspected the success of Generative Control Policies (GCPs) might be "Much Ado About Noising." We rigorously tested the myths. 🧵👇

Josh McClellan retweeted
Andrew Davison@AjdDavison·
I still don't believe that Spatial AI is a big data problem. Massive data won't unlock robotics. Representation is still the hard bit. We need efficient, composable representation of 3D physical scenes to enable mental simulation like humans use to plan creative uses of objects.
Andrew Davison@AjdDavison

Seems like a good moment to tweet a few quotes from my 2018 position paper: FutureMapping: The Computational Structure of Spatial AI Systems. #SpatialAI arxiv.org/abs/1803.11288

Josh McClellan retweeted
François Chollet@fchollet·
Much of the field obsesses over end-to-end learning. But strong generalization requires compositionality: building modular, reusable abstractions, and reassembling them on the fly when faced with novelty. The models of the future won't be just pipes, they will be Lego castles.
Taco Cohen@TacoCohen·
On my way to NeurIPS! Looking forward to meeting old friends and making new ones. LMK if you're into codegen and RL and want to chat!
Josh McClellan@JoshMcClellan0·
This robustness stems directly from its symmetry guarantees, allowing it to lose less performance when adapting to new scenarios. If you'll be at NeurIPS, come visit our poster next week to learn more and discuss the exciting future of MARL!
Josh McClellan@JoshMcClellan0·
I'm excited to share that our paper, "Boosting Sample Efficiency and Generalization in Multi-agent Reinforcement Learning via Equivariance," has been accepted to NeurIPS 2024! 🎉 arxiv.org/abs/2410.02581 @ptokekar @furongh
Josh McClellan@JoshMcClellan0·
@PierreAblin Thanks for sharing. I'm curious how this changed after rebuttals
Pierre Ablin@PierreAblin·
NeurIPS review scores are super low this year; the 25th percentile in my batch of 67 papers is 5.2