Hanyang Chen

28 posts

@hc81Jeremy

M.S. @ UIUC; exploring some embodied AI stuff

Illinois, USA · Joined June 2024

193 Following · 82 Followers

Hanyang Chen @hc81Jeremy · Nov 6

@nanjiang_cs 4 hours ago it’s theoretical online RL and now testing the theory

Nan Jiang @nanjiang_cs · Nov 6

quack quack

Hanyang Chen retweeted

Daniel Kang @ddkang · Nov 6

🤖 Feeling excited about the future of household robotic agents (i.e., embodied agents)? You should also consider their safety! 🔪Meet BEAT: the first visual backdoor attack on MLLM-based embodied agents. 🧵 1/7

Hanyang Chen @hc81Jeremy · Oct 24

It's truly touching to see the solidarity of the academic community in this wave of Meta layoffs.

Hanyang Chen retweeted

Manling Li @ManlingLi_ · Oct 17

World Model Reasoning for VLM Agents (NeurIPS 2025, Score 5544) We release VAGEN to teach VLMs to build internal world models via visual state reasoning: - StateEstimation: what is the current state? - TransitionModeling: what is next? MDP → POMDP shift to handle the partial observability from visual states! mll.lab.northwestern.edu/VAGEN/ 🙌Led by @James_KKW @WilliamZhangNU @wzihanw @yaning_gao @LINJIEFUN @qineng_wang @hc81Jeremy @w4nanch1 @2prime_PKU @zhengyuan_yang lijuanwang,@RanjayKrishna @jiajunwu_cs @drfeifei @YejinChoinka 👍Grateful for the joint effort of @northwesterncs @uwcse @StanfordAILab @microsoft @WisconsinCS @siebelschool.

Hanyang Chen @hc81Jeremy · Oct 16

Sincere appreciation to @ZihaoZH94437841 @RuiYang70669025 @EmpathYang @ExplainMiracles @James_KKW @jackbai_jkb @zhenhailongW, @rui4research, Prof. ChengXiang Zhai (@TIMANUIUC) @hengjinlp @uiuc_nlp @ManlingLi_ @huan_zhang12 and Prof. Tong Zhang. They made it possible.

Hanyang Chen @hc81Jeremy · Oct 16

🧵 Thread 4 🎬 Case Study After ERA training, a 3B model that originally failed every task can now reason and act step-by-step: (a) On EB-ALFRED, it reflects on earlier mistakes to finally clean and place the plate correctly. (b) On EB-Manipulation, it accurately fits the star into the right slot. 💭 Small models can see as clearly, think as deeply, and act as precisely as giants, while being more efficient.

Hanyang Chen @hc81Jeremy · Oct 16

🤖️Today we introduce the Embodied Reasoning Agent (ERA), a framework that transforms a compact Vision Language Model (VLM) into a performant and efficient embodied agent. While large models like GPT-4o and Gemini show strong embodied performance on EmbodiedBench, smaller ones often fail completely. But if a robot can’t fit a huge model, how can a small one still understand the world, plan tasks, and act precisely? 💡 That brings us to ERA. By asking 1. What prior knowledge does an embodied agent require before RL? and 2. What makes RL in long-horizon embodied tasks stable and effective?, we distill the answers into a unified post-training regime that can deliver both a high-level planning agent and a low-level control agent through different curation of training data. 🔗 Web link: embodied-reasoning-agent.github.io 📄 Paper link: arxiv.org/abs/2510.12693 🧵 Thread 1)

Hanyang Chen @hc81Jeremy · Jul 19

@ManlingLi_ Appreciation for the continuous support, Manling.

Manling Li @ManlingLi_ · Jul 19

Check out @hc81Jeremy's oral presentation on EmbodiedBench at ICML! He is also applying to PhD programs this year, so please reach out to him at ICML!

Hanyang Chen @hc81Jeremy

Grateful for the chance to present EmbodiedBench at ICML as an Oral. A rewarding experience full of learning. Thanks to @RuiYang70669025 @hengjinlp @jyzhang1208 @huan_zhang12 Mark_Zhao @ManlingLi_ Tong_Zhang and many others who made it possible. See you next time.

Hanyang Chen @hc81Jeremy · Jul 19

Hanyang Chen @hc81Jeremy · Jul 18

Thanks, Prof. Yu, for being at our talk @Zhou_Yu_AI. Happy ICML25🇨🇦

Zhou Yu @Zhou_Yu_AI

@RuiYang70669025 @hc81Jeremy Good work, enjoyed the talk.

Hanyang Chen retweeted

Zhou Yu @Zhou_Yu_AI · Jul 18

@RuiYang70669025 @hc81Jeremy Good work, enjoyed the talk.

Hanyang Chen retweeted

Rui Yang @RuiYang70669025 · Jul 15

My coauthor @hc81Jeremy will present EmbodiedBench at ICML 2025! 🤖 Oral Session 6A 📍 West Hall C 🕧 July 17 3:30-3:45 pm PDT 📌 Poster Session 📍 East Hall A-B #E-2411 🕜 July 17 4:30-7 pm PDT Come say hi and let’s talk about VLM agent training, evaluation, and benchmarking! 😀

Hanyang Chen retweeted

David Pfau @pfau · Jul 13

I'll be at ICML next week, presenting our paper on Wasserstein Policy Optimization on Tuesday! If you're in Vancouver, come say hi!

Hanyang Chen @hc81Jeremy · Jun 8

Excited to share that EmbodiedBench was selected for an Oral at ICML 2025! We recently added results for new models (InternVL3, Gemma3, Ovis2) and released a large agent trajectory dataset on 🤗: embodiedbench.github.io Try training and evaluating your MLLM for embodied agents!

Hanyang Chen retweeted

Jia-Bin Huang @jbhuang0604 · Dec 10

How to schedule a meeting? When you ask for a meeting with others, you are asking for their time. You are asking for their most valuable, finite resource to benefit yourself (e.g., for advice, networking, questions, and opportunities). Here are some tips that I found useful.

Hanyang Chen retweeted

Hanning Zhang @HanningZhangHK · Feb 17

🚀 Excited to share our latest work on Iterative-DPO for math reasoning! Inspired by DeepSeek-R1 & rule-based PPO, we trained Qwen2.5-MATH-7B on Numina-Math prompts. Our model achieves 47.0% pass@1 on AIME24, MATH500, AMC, Minerva-Math, OlympiadBench—outperforming LLaMA-3.1-70B-Instruct and approaching Eurus-2-7B-PRIME! With SFT warm-up + Iterative-DPO, we reached 51.8%, surpassing Qwen2.5-7B-SimpleRL-Zero and matching our PPO-Zero. 🔍 Key takeaways: 1️⃣ NLL loss doesn't help DPO. 2️⃣ DPO on MATH alone saturates—diverse prompts (Numina-Math) help. 3️⃣ RAFT (Rejection Sampling Finetuning) is simple & effective for rule-based RL. 4️⃣ Long CoT for warm-up SFT significantly improves DPO, reaching PPO-level performance. 5️⃣ Base models already show self-reflection ("aha moments"), RL doesn’t boost this. 6️⃣ Response length changes are inconsistent, no upward trend. 7️⃣ PPO still wins—DPO & RAFT improve models but fall slightly short of PPO (51.8%). 📜 Code & models all open-sourced! Try them out & share feedback! 🔗 GitHub: github.com/RLHFlow/Online… 🔗 Notion: efficient-unicorn-451.notion.site/Online-DPO-R1-…

Hanyang Chen @hc81Jeremy · Feb 14

Thanks, Manling, for sharing the work! Improve your VLM with EmbodiedBench.

Manling Li @ManlingLi_

Excited to release EmbodiedBench for VLMs! It is time to work on embodied agents using VLMs🔥 embodiedbench.github.io 🔍 1,128 tasks across 4 diverse environments 🎯 6 fine-grained evaluation capabilities (reasoning, planning, perception & more) 📊 Benchmarked on 13 top models

Hanyang Chen @hc81Jeremy · Feb 14

🔥Exploring MLLM as Embodied Generalist. 🔍 4 diverse tasks - from High Level Planning to Low Level Manipulation 🎯 6 fine-grained evaluation capabilities ALL IN ONE MLLM. 📊 More than a Benchmark - A standardized platform for more algorithms to spark.

Rui Yang @RuiYang70669025

🤖Can MLLM agents reason about spatial relationships and plan atomic actions for navigation & manipulation? 🔥 Meet EmbodiedBench 🏆—the first fine-grained benchmark for MLLM-based embodied agents! 📄 Paper: arxiv.org/abs/2502.09560 🌐 Website & code: embodiedbench.github.io
