Hanyang Chen

28 posts


@hc81Jeremy

M.S. @ UIUC; exploring some embodied AI stuff

Illinois, USA · Joined June 2024
193 Following · 82 Followers
Hanyang Chen (@hc81Jeremy)
@nanjiang_cs 4 hours ago it was theoretical online RL, and now it's testing the theory
Nan Jiang (@nanjiang_cs)
quack quack
Hanyang Chen reposted
Daniel Kang (@ddkang)
🤖 Feeling excited about the future of household robotic agents (i.e., embodied agents)? You should also consider their safety! 🔪Meet BEAT: the first visual backdoor attack on MLLM-based embodied agents. 🧵 1/7
Hanyang Chen (@hc81Jeremy)
It's truly touching to see the solidarity of the academic community in this wave of Meta layoffs
Hanyang Chen reposted
Manling Li (@ManlingLi_)
World Model Reasoning for VLM Agents (NeurIPS 2025, Score 5544)
We release VAGEN to teach VLMs to build internal world models via visual state reasoning:
- StateEstimation: what is the current state?
- TransitionModeling: what is next?
MDP → POMDP shift to handle the partial observability from visual states!
mll.lab.northwestern.edu/VAGEN/
🙌 Led by @James_KKW @WilliamZhangNU @wzihanw @yaning_gao @LINJIEFUN @qineng_wang @hc81Jeremy @w4nanch1 @2prime_PKU @zhengyuan_yang lijuanwang, @RanjayKrishna @jiajunwu_cs @drfeifei @YejinChoinka
👍 Grateful for the joint effort of @northwesterncs @uwcse @StanfordAILab @microsoft @WisconsinCS @siebelschool.
Hanyang Chen (@hc81Jeremy)
🧵 Thread 4
🎬 Case Study
After ERA training, a 3B model that originally failed every task can now reason and act step by step:
(a) On EB-ALFRED, it reflects on earlier mistakes to finally clean and place the plate correctly.
(b) On EB-Manipulation, it accurately fits the star into the right slot.
💭 Small models can see as clearly, think as deeply, and act as precisely as giants, while being more efficient.
Hanyang Chen (@hc81Jeremy)
🤖️ Today we introduce the Embodied Reasoning Agent (ERA), a framework that transforms a compact Vision Language Model (VLM) into a performant and efficient embodied agent.
While large models like GPT-4o and Gemini show strong embodied performance on EmbodiedBench, smaller ones often fail completely. But if a robot can't fit a huge model, how can a small one still understand the world, plan tasks, and act precisely?
💡 That brings us to ERA. We ask:
1. What prior knowledge does an embodied agent require before RL?
2. What makes RL in long-horizon embodied tasks stable and effective?
We distill the answers into a unified post-training regime that can deliver both a high-level planning agent and a low-level control agent, depending on how the training data is curated.
🔗 Web link: embodied-reasoning-agent.github.io
📄 Paper link: arxiv.org/abs/2510.12693
🧵 Thread 1)
Hanyang Chen reposted
Rui Yang (@RuiYang70669025)
My coauthor @hc81Jeremy will present EmbodiedBench at ICML 2025!
🤖 Oral Session 6A 📍 West Hall C 🕧 July 17, 3:30-3:45 pm PDT
📌 Poster Session 📍 East Hall A-B #E-2411 🕜 July 17, 4:30-7 pm PDT
Come say hi and let's talk about VLM agent training, evaluation, and benchmarking! 😀
Hanyang Chen reposted
David Pfau (@pfau)
I'll be at ICML next week, presenting our paper on Wasserstein Policy Optimization on Tuesday! If you're in Vancouver, come say hi!
Hanyang Chen (@hc81Jeremy)
Excited to share that EmbodiedBench was selected for an Oral at ICML 2025! We recently added results for new models (InternVL3, Gemma3, Ovis2) and released a large agent trajectory dataset on 🤗: embodiedbench.github.io Try training and evaluating your MLLM as an embodied agent!
Hanyang Chen reposted
Jia-Bin Huang (@jbhuang0604)
How to schedule a meeting? When you ask for a meeting with others, you are asking for their time. You are asking for their most valuable, finite resource to benefit yourself (e.g., for advice, networking, questions, and opportunities). Here are some tips that I found useful.
Hanyang Chen reposted
Hanning Zhang (@HanningZhangHK)
🚀 Excited to share our latest work on Iterative-DPO for math reasoning! Inspired by DeepSeek-R1 & rule-based PPO, we trained Qwen2.5-MATH-7B on Numina-Math prompts.
Our model achieves 47.0% pass@1 on AIME24, MATH500, AMC, Minerva-Math, OlympiadBench, outperforming LLaMA-3.1-70B-Instruct and approaching Eurus-2-7B-PRIME! With SFT warm-up + Iterative-DPO, we reached 51.8%, surpassing Qwen2.5-7B-SimpleRL-Zero and matching our PPO-Zero.
🔍 Key takeaways:
1️⃣ NLL loss doesn't help DPO.
2️⃣ DPO on MATH alone saturates; diverse prompts (Numina-Math) help.
3️⃣ RAFT (Rejection Sampling Finetuning) is simple & effective for rule-based RL.
4️⃣ Long CoT for warm-up SFT significantly improves DPO, reaching PPO-level performance.
5️⃣ Base models already show self-reflection ("aha moments"), RL doesn't boost this.
6️⃣ Response length changes are inconsistent, no upward trend.
7️⃣ PPO still wins: DPO & RAFT improve models but fall slightly short of PPO (51.8%).
📜 Code & models all open-sourced! Try them out & share feedback!
🔗 GitHub: github.com/RLHFlow/Online…
🔗 Notion: efficient-unicorn-451.notion.site/Online-DPO-R1-…
Hanyang Chen (@hc81Jeremy)
🔥 Exploring MLLMs as embodied generalists. 🔍 4 diverse tasks, from high-level planning to low-level manipulation 🎯 6 fine-grained evaluation capabilities, ALL IN ONE MLLM. 📊 More than a benchmark: a standardized platform to spark more algorithms.
Rui Yang (@RuiYang70669025)

🤖Can MLLM agents reason about spatial relationships and plan atomic actions for navigation & manipulation? 🔥 Meet EmbodiedBench 🏆—the first fine-grained benchmark for MLLM-based embodied agents! 📄 Paper: arxiv.org/abs/2502.09560 🌐 Website & code: embodiedbench.github.io
