John Zhou

61 posts

@johnlyzhou

CS PhD student @UCLA, previously @Columbia | Scalable reinforcement learning

Los Angeles, CA · Joined August 2021
360 Following · 136 Followers
John Zhou retweeted
Haoran Xu✈️ICLR26 @ryanxhr
Both offline RL and LLM RL fine-tuning can be formulated as behavior-regularized RL problems. We propose Value Gradient Flow (VGF), a new scalable and sample-efficient paradigm that treats behavior-regularized RL as an optimal transport problem. arxiv.org/abs/2604.14265 🧵[1/7]
GIF
3
23
176
13.3K
John Zhou retweeted
Dan Lee @Danicmhlee
We're moving to a future vision of fully synthetic pre-training for LLMs. Our new work explores using Neural Cellular Automata to embed reasoning before language training even begins! I'm deeply grateful to @seungwookh, @akarshkumar0101, and @pulkitology for their mentorship, guidance, and deep insights throughout this work.
Seungwook Han @seungwookh

Can language models learn useful priors without ever seeing language? We pre-pre-train transformers on neural cellular automata — fully synthetic, zero language. This improves language modeling by up to 6%, speeds up convergence by 40%, and strengthens downstream reasoning. Surprisingly, it even beats pre-pre-training on natural text! Blog: hanseungwook.github.io/blog/nca-pre-p… (1/n)

0
3
11
1.5K
John Zhou retweeted
Omar Rayyan @omarrayyann
MolmoSpaces provides singular scale and diversity. We built a benchmark that puts that scale to use. MolmoSpaces-Bench evaluates zero-shot policies across thousands of previously unseen environments under systematic variation, providing insights that go beyond a success-rate percentage. More below:
Ai2 @allen_ai

Introducing MolmoSpaces, a large-scale, fully open platform + benchmark for embodied AI research. 🤖 230k+ indoor scenes, 130k+ object models, & 42M annotated robotic grasps—all in one ecosystem.

6
16
156
13.1K
John Zhou retweeted
Edward Hu @edward_s_hu
Nobody asked, but here's 4 world model papers that I read early on in my PhD which I still ponder over now.
- Value Equivalence Principle
- Learning Awareness Models
- Embedded Agency (figure pic below), Big World Hypothesis
See the thread for details:
Edward Hu tweet media
10
16
287
17.2K
John Zhou retweeted
Micah Goldblum @micahgoldblum
We built methods to handle both (1) and (2), but I’ll focus on a stupid simple trick that works particularly well: adversarial training. Adversarial training makes input gradients better behaved, in turn making gradient-based planning fast and easy. 8/11
Micah Goldblum tweet media
3
5
78
7.5K
John Zhou @johnlyzhou
@zhiyuan_zhou_ Hey Paul, would love to meet up and learn more about the benefits of advantage conditioning!
0
0
2
115
Paul Zhou @zhiyuan_zhou_
I’ll be at #NeurIPS2025 in San Diego! Happy to chat about advantage conditioning and RECAP, and the making of pi*06, and robot learning and RL in general. Also presenting two RL papers 👇
Paul Zhou @zhiyuan_zhou_

Very excited to finally share what I’ve been up to @physical_int for the past 6 months: developing advantage-conditioned VLAs! We are finally moving beyond imitating teleop data, and towards improving models with suboptimal deployment data using scalable real-world RL. 👇🧵

11
7
118
16.6K
John Zhou @johnlyzhou
@TongheZhang01 I’ll be presenting Thursday morning but mostly free otherwise - happy to hash out times in DMs!
1
0
0
20
Tonghe Zhang @TongheZhang01
@johnlyzhou oh that's wonderful! will be presenting poster on Friday but definitely we can talk earlier! what's your availability?
1
0
1
114
Tonghe Zhang @TongheZhang01
If you are interested in RL, VLM, VLA, and efficient real-world data collection for manipulators, come and chat with me in San Diego from Dec 3rd to 7th.
15
11
191
30.3K
John Zhou @johnlyzhou
If you’re at #NeurIPS2025 from Tuesday to Sunday and interested in any of: offline RL, offline-to-online finetuning, VLM value functions/reward models/VLAs, or RL for real-world robots, please reach out and let’s chat!
0
2
10
1.1K
John Zhou @johnlyzhou
@jaesikyoon_ Hi Jaesik, I really enjoyed your MCTD work and would love to chat more about it at NeurIPS!
1
0
1
105
Jaesik Yoon @jaesikyoon_
I’ll be attending NeurIPS next week. Happy to connect and discuss ideas around diffusion-based planning, generative search, and reasoning with generative models!
1
1
10
558
John Zhou retweeted
Chang Shi @sshchang
As a robotics researcher, I believe accurately modeling complex interactions between agents would be a big step for scaling up robot learning from unlabeled video. Looking forward to some inspiring discussion with the Cohere Labs Embodied AI community!
Cohere Labs @Cohere_Labs

Don't miss our Embodied AI group's session this week on November 21st with @sshchang for a presentation on "FLAM: Scaling Latent Action World Models with Factorization." Thanks to @nahidalam and Cole Harrison for organizing this event! ✨ Learn more: cohere.com/events/cohere-…

0
2
21
4.5K
John Zhou retweeted
Seohong Park @seohong_park
I had a fun chat with @chris_j_paxton and @micoolcho about the scalability of RL for robotics!
RoboPapers @RoboPapers

Offline reinforcement learning is crucial for robotics, but does it scale? We talk to @seohong_park, who discusses how, for long-horizon manipulation problems, the answer may be no, at least not yet. But there are tricks that you can use to make it work effectively. Watch episode #38 of RoboPapers with @micoolcho and @chris_j_paxton now!

1
11
90
12.1K
John Zhou retweeted
Seohong Park @seohong_park
Introducing *dual representations*! tl;dr: We represent a state by the "set of similarities" to all other states. This dual perspective has lots of nice properties and practical benefits in RL. Blog post: seohong.me/blog/dual-repr… Paper: arxiv.org/abs/2510.06714
Seohong Park tweet media
14
118
938
176.1K