David McAllister

173 posts

@davidrmcall

PhD Student @berkeley_ai

Joined June 2024
312 Following · 899 Followers
Pinned Tweet
David McAllister @davidrmcall
Excited to share Flow Matching Policy Gradients: expressive RL policies trained from rewards using flow matching. It’s an easy, drop-in replacement for Gaussian PPO on control tasks.
8
200
1.2K
149.4K
David McAllister retweeted
Junyi Zhang @junyi42
𝗢𝗻𝗲 𝗺𝗲𝗺𝗼𝗿𝘆 𝗰𝗮𝗻’𝘁 𝗿𝘂𝗹𝗲 𝘁𝗵𝗲𝗺 𝗮𝗹𝗹. We present 𝗟𝗼𝗚𝗲𝗥, a new 𝗵𝘆𝗯𝗿𝗶𝗱 𝗺𝗲𝗺𝗼𝗿𝘆 architecture for long-context geometric reconstruction. LoGeR enables stable reconstruction over up to 𝟭𝟬𝗸 𝗳𝗿𝗮𝗺𝗲𝘀 / 𝗸𝗶𝗹𝗼𝗺𝗲𝘁𝗲𝗿 𝘀𝗰𝗮𝗹𝗲, with 𝗹𝗶𝗻𝗲𝗮𝗿-𝘁𝗶𝗺𝗲 𝘀𝗰𝗮𝗹𝗶𝗻𝗴 in sequence length, 𝗳𝘂𝗹𝗹𝘆 𝗳𝗲𝗲𝗱𝗳𝗼𝗿𝘄𝗮𝗿𝗱 inference, and 𝗻𝗼 𝗽𝗼𝘀𝘁-𝗼𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻. Yet it matches or surpasses strong optimization-based pipelines. (1/5) @GoogleDeepMind @Berkeley_AI
63
449
3.4K
549K
David McAllister retweeted
Angjoo Kanazawa @akanazawa
@brenthyi who worked on FPO/FPO++ is finishing his PhD and going on the job market 😭✨ He is also the person behind viser, pyroki, egoallo, jaxls, tyro and more! I can't express how amazing it is to have Brent on your team..! Any team would be incredibly lucky to have him!!
Angjoo Kanazawa @akanazawa

FPO++! We got RL on flow policies working on real robot tasks. Sim2real on humanoids trained from scratch + manipulation finetuning in sim with action chunking. Excited about this direction because we can now use RL with expressive policies to discover new behaviors!

2
13
106
9.8K
Anthropic @AnthropicAI
We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax. These labs created over 24,000 fraudulent accounts and generated over 16 million exchanges with Claude, extracting its capabilities to train and improve their own models.
7.3K
6.3K
55K
33.6M
David McAllister retweeted
Grace Luo @graceluo_
We trained diffusion models on a billion LLM activations, and we want you to use them! New preprint: "Learning a Generative Meta-Model of LLM Activations". Joint work with @feng_jiahai, @trevordarrell, @AlecRad, @JacobSteinhardt. More in thread 🧵
30
170
1.3K
189K
David McAllister retweeted
Fanqi Lin @lfqirrrrr
𝑪𝒐-𝒕𝒓𝒂𝒊𝒏𝒊𝒏𝒈 is a promising way to scale Large Behavior Models (LBMs) beyond robot data, yet the data and training recipe are far from settled. 🤔 We present a large-scale empirical study leveraging 4,000h of robot/human data and 50M vision-language samples, evaluating 89 policies across 58,000 simulation rollouts and 2,835 real-world trials. 🤖📊 co-training-lbm.github.io Work done during my internship at @ToyotaResearch.
7
49
415
46.8K
David McAllister retweeted
Brent Yi @brenthyi
tyro 1.0 is out 🐣 This has been a pet project/niche interest of mine for ~4 years now, so it's a bit of a sentimental moment... github.com/brentyi/tyro
11
22
181
42.2K
David McAllister retweeted
Qiyang (Colin) Li @qiyang_li
Action chunking is drawing growing interest in RL, yet its theoretical properties are still understudied. We are excited to share some insights on when we should use action chunking in Q-learning + a new algo (DQC) to tackle hard long-horizon tasks! colinqiyangli.github.io/dqc 🧵 1/N
6
54
303
62.6K
David McAllister retweeted
Qianqian Wang @QianqianWang5
I'm recruiting multiple PhD students this cycle to join me at Harvard University and the Kempner Institute! My interests span vision and intelligence, including 3D/4D, active perception, memory, representation learning, and anything you're excited to explore! Deadline: Dec 15th.
25
154
932
174.2K
David McAllister retweeted
Ben Recht @beenwrekt
In honor of the 39th AI Winter, I’m going to spend the week disentangling the culture and code of reinforcement learning. There may be ranting... argmin.net/p/reformist-re…
2
9
99
10.7K
David McAllister retweeted
Delip Rao e/σ @deliprao
Hey @iclr_conf, reverting scores is unnecessary punishment for the majority of the authors who had nothing to do with this incident and had successful rebuttals. Instead of detecting collusions on your end (you have a ton of metadata) why is this everyone’s burden to bear?
8
29
215
38.9K
David McAllister retweeted
Ayaan Haque @ayaanzhaque
Excited to release Terminal Velocity Matching, our latest work on single-stage generative paradigms for one/few-step sampling. It’s SOTA on 1-step sampling, beats diffusion, and really works at scale! Tremendous work by @linqi_zhou, it's amazing how easy this was to train at large scale!
Luma @LumaLabsAI

Introducing Terminal Velocity Matching: a scalable, single-stage generative training method that delivers diffusion-level quality with 25× fewer inference steps, now trained at 10B+ scale. lumalabs.ai/blog/engineeri…

3
4
46
6.7K
David McAllister retweeted
tyler bonnen @tylerraye
starting fall 2026 i'll be an assistant professor at @Penn 🥳 my lab will develop scalable models/theories of human behavior, focused on memory and perception currently recruiting PhD students in psychology, neuroscience, & computer science! reach out if you're interested😊
35
59
439
52.3K
David McAllister retweeted
Ethan Weber @ethanjohnweber
It's really great working with Jensen (@jensenzhoujh) on this effort. Please get in touch with us if you have experience in these post-training topics and would like to work with us! 🙌 🙂
Jensen Zhou @jensenzhoujh

We are looking for contributors for World Model Post-Training of foundational video models at Meta @AIatMeta! We are looking for talent with expertise in RL post-training, distillation, attention sparsification, diffusion model, and more to hop onboard. Candidates at all career levels are welcomed, whether students or not. We have immediate and flexible start dates for contractor positions. Onsite collaboration is possible in Zurich 🇨🇭, London 🇬🇧, or New York 🇺🇸. If you’re driven about advancing interactive spatial intelligence, we are here to talk - feel free to DM me and @ethanjohnweber.

1
2
15
3.8K
Hongsuk Benjamin Choi @redstone_hong
VideoMimic!
Saeejith Nair @sighjith

I built papiers.ai, a new interface for arXiv. As we enter an era of accelerated scientific discovery, we need better tools that augment human cognition to help us keep up. Try it: visit papiers ai or swap arxiv -> papiers on any paper URL.

2
0
24
1.5K
Jia-Bin Huang @jbhuang0604
Proud advisor moment 😊 Congrats @Songwei_Ge for winning the Larry S. Davis Doctoral Dissertation Award @umdcs! Songwei is now cooking as a research scientist at @reve. Looking forward to amazing work!
12
4
90
23.1K