Zhan Xianyuan

51 posts

@atLargeIC

Associate professor at AIR (@AIRTHU1201), Tsinghua University (@Tsinghua_Uni)

Joined August 2010

158 Following · 81 Followers

Zhan Xianyuan@atLargeIC·5 Mar

Want to know the best practices to construct a diffusion-based E2E model for autonomous driving? Check our latest work HDP! We provide systematic analyses on a wide range of design choices in diffusion-based modeling, with 200+km real-vehicle testing. The devil is in the details!

Zheng Yinan@ZhengYinan2001

Excited to share our new work: Hyper Diffusion Planner (HDP)! We systematically unleash the potential of diffusion models for E2E AD — rethinking IL pre-training, and RL post-training. 10× closed-loop improvement over the base model, validated on 200 km of real-world driving.

Zhan Xianyuan retweeted

Jianxiong Li@NeurIPS 2025@Facebear_ljx·5 Dec

Cool! Very happy that X-VLA can be integrated in Lerobot! Hope it can be helpful for future research~

LeRobot@LeRobotHF

🚀 Introducing X-VLA: LeRobot’s new soft-prompted Vision-Language-Action model. X-VLA is built to scale across many embodiments: different robots, cameras, action spaces, and environments, all handled by one unified transformer backbone.
- Generalist across robots (Franka, WidowX, Agibot, sim + real)
- Soft-prompt domain IDs let the model adapt to new hardware with tiny learnable embeddings
- Flow-matching + transformer core for smooth, continuous 50 Hz control
- Pretrained on a mixed-embodiment dataset spanning 7+ platforms and diverse tasks
- Fine-tune on any dataset using one of the 6 checkpoints we provide out of the box.

Zhan Xianyuan retweeted

LeRobot@LeRobotHF·4 Dec

Zhan Xianyuan@atLargeIC·23 Oct

X-VLA has won FIRST PLACE in the IROS 2025 AGIBOT World Challenge! Congrats to all my wonderful students!🎉🎉🎉

Zhan Xianyuan@atLargeIC

Excited to introduce X-VLA! Our latest cross-embodied VLA model with remarkable sample & parameter efficiency. With only 0.9B parameters, X-VLA establishes new SOTA on a wide range of mainstream embodied benchmarks!

Zhan Xianyuan@atLargeIC·19 Oct

Jinliang Zheng@2_toinf

🚀 New: X-VLA (0.9B) - A Lightweight VLA model for cross-embodiment manipulation
🧺 2+ hrs cloth folding autonomy, trained with only 1.5K demos
📊 Benchmarks: 96% Simpler-WidowX, 97.4% Libero-Long, 70% RoboTwin, + more
Project Page: thu-air-dream.github.io/X-VLA/ (1/N)

Zhan Xianyuan@atLargeIC·18 Oct

A great mind... RIP🙏

Bloomberg@business

Chen Ning Yang, a Nobel Prize-winning physicist who gave up his US citizenship to become a citizen of China in 2015 and helped persuade other scientists to do the same, passed away. He was 103. bloomberg.com/news/articles/…

Zhan Xianyuan retweeted

Moritz Reuss@moritz_reuss·14 Oct

VLAs have become the fastest-growing subfield in robot learning. So where are we now? After reviewing ICLR 2026 submissions and conversations at CoRL, I wrote an overview of the current state of VLA research with some personal takes: is.gd/1pqw9w

Zhan Xianyuan retweeted

Puneesh Deora@puneeshdeora·14 May

Global reaction to Overleaf going down

Zhan Xianyuan retweeted

Josh Clymer@joshua_clymer·25 Apr

AI R&D automation is often cited as a risk, but what’s the threat model exactly? There are several! – and they are often conflated. I co-wrote a paper with 13 international experts that breaks down AI R&D threat models, ‘red line’ thresholds, and mitigations.

Zhan Xianyuan@atLargeIC·1 Apr

@harshit_sikchi @scottniekum @yayitsamyzhang @marcgbellemare @yukez @PeterStone_TX Congrats Harshit!

Harshit Sikchi ✈️ ICLR 26@harshit_sikchi·1 Apr

Successfully defended my Ph.D. today 🎓🥳! @scottniekum and @yayitsamyzhang are the best advisors I could have ever asked for. A big thanks to my committee members @marcgbellemare @yukez @PeterStone_TX . The full presentation video will be uploaded soon... Excited about what's to come!

Zhan Xianyuan@atLargeIC·30 Jan

Glad to share another great work from our lab: Diffusion Planner! The SOTA planning model for autonomous driving. Rule-based refinement may become history for learning-based autonomous driving models, thanks to the powerful diffusion models!

Zheng Yinan@ZhengYinan2001

🥳New SOTA on nuPlan! #ICLR2025 🚗Diffusion Planner leverages diffusion models with a specifically designed architecture for high-performance motion planning, reducing dependence on rule-based refinement. 🔗 zhengyinan-air.github.io/Diffusion-Plan…

Zhan Xianyuan retweeted

Andrej Karpathy@karpathy·29 Jan

TinyZero reproduction of R1-Zero "experience the Ahah moment yourself for < $30" Given a base model, the RL finetuning can be relatively very cheap and quite accessible.

Jiayi Pan@jiayi_pirate

We reproduced DeepSeek R1-Zero in the CountDown game, and it just works
Through RL, the 3B base LM develops self-verification and search abilities all on its own
You can experience the Ahah moment yourself for < $30
Code: github.com/Jiayi-Pan/Tiny…
Here's what we learned 🧵

Zhan Xianyuan retweeted

Haoyi Niu@t641769919·28 Jan

🥳Happy to announce H2O+ is accepted at #ICRA2025. Removing the over-conservatism and facilitating exploration in simulation, H2O+ is open to any Offline/Online RL design. Stay tuned for more updates! We will release our code in the following month. Happy Chinese New Year!🎊🏮🐍

Haoyi Niu@t641769919

📢Thrilled to have one paper accepted at #NeurIPS2022🥂with awesome teammates @HappyyPablo @EvieQ_01 advised by @atLargeIC💐Hope the Hybrid Offline-and-Online setting can shed new light on attractive solutions to the real-world deployment for RL. arxiv.org/abs/2206.13464

Zhan Xianyuan retweeted

Vedang Vatsa FRSA@vedangvatsa·27 Jan

🧵 Hidden Gems in DeepSeek-R1’s Paper

Zhan Xianyuan@atLargeIC·20 Jan

Introducing UniAct! One of my favorite works finished last year in my lab. UniAct enables learning universal actions to power ANY robotic embodiments, physical meaning, and control interfaces. The project page is at: 2toinf.github.io/UniAct/. Stay tuned for more exciting news!!!

Jinliang Zheng@2_toinf

#Robotic_Foundation_Models #Universal_Actions 0.5B beats 7B models🦾! We introduce a highly effective pretraining paradigm for embodied foundation model, powered by a Universal Action Space that boosts cross-embodiment data sharing and generalization! 2toinf.github.io/UniAct/

Zhan Xianyuan retweeted

Zihan "Zenus" Wang ✈️ ICLR@wzenus·28 Dec

[Long Tweet Ahead] I just have to say, I’m genuinely impressed by DeepSeek. 💡 It’s no wonder their reports are so elegant and fluffless. Here’s what I noticed about their culture, a space where real innovation thrives, during my time there ↓

🌟 1. Be nice and careful to talents
- The recruiting teams seek top talent from China & globally. Many are PhD / grad / undergrads from Chinese top 10 universities e.g., Tsinghua / Peking University.
- Hiring is minimalist: My interview took only a few rounds. They basically check two criteria: Do you genuinely WANT to push fundamental AI problems forward? CAN you make it happen (at least one standout skill + solid skills to get things done)?
- Roles seem shaped around the talent, instead of vice versa. Not like “we need a role, so we find a talent”, they basically ask: “Here’s an exceptional talent; how can they contribute?” This can lead to something unconventional: they can hire someone with expertise in MBTI who finally focuses on creating more personalized / role-playing models.
- Something basic: Top-tier benefits in China, including for interns, allowing them to concentrate on work matters and worry less about material concerns.

🤝 2. Individualized HR culture
- With above talent-first hiring logistics, even with a 200-people scale, I still feel everyone is unique and there is no such thing like a standardization where everyone can be replaced like a cog-in-machine.
- No pressure or forced KPIs. I hardly feel any sense like “this must be done by this Thursday” from my mentor / seniors / colleagues.
- Being collaborative. DeepSeek tries its best to forbid race inside the company. It’s like everyone contributes to the final model with their own (orthogonal) ideas and everyone hopes their idea is useful. If an idea is proved useful, everyone celebrates, and everyone is happy about it.

⚙️ 3. Disentangled development systems
- DeepSeek covers a highly diverse set of talent directions. It’s like how “expert specialization” happens in their MoE models. People focus on what they’re best at, and it’s natural to ask others things out of their expertise. Helping others with one's expertise is not what people only do after completing their own work.
- There is a shared basic pipeline that works pretty well for everyone. When a group adds new things to the system, they do really good documentation so others can know what happens in a minute and how it affects their own roles (most of the time, this won’t affect their work; they just feel things improve automatically).
- Feedback loops are FAST: To verify whether ideas could work, is basically just to test whether it could work on the super-latest simplified baseline. I strongly feel whenever I have an idea in the morning, I can realize whether it’s effective in the afternoon -- no organization approval, no hard GPU utilization restrictions, little debugging (thanks to the rigorously debugged baseline), just try to seamlessly add my own idea to the model. This makes working there super reflective and feedback-rich at the beginning of an idea, even if many ablations are required later to finally merge the idea to the giant model.

So all of the above makes the organization super Spontaneous-person-friendly, and maybe this is why you can always trust their tech paths even when many improvements / ideas are applied in each single model release. I do appreciate such disentangled organization, which makes fast and solid iterations at different angles in the model.

🌍 4. Diversity sparks innovation
It’s not really about something like “we must consider every party”. They pay attention to inclusion but it’s not the biggest matter. The biggest matter lies in “How can people from diverse backgrounds contribute to the DeepSeek model?” I have many colleagues called know-it-all “百晓生”, a role-of-talent that DeepSeek hires. As an AI company, it’s interesting to see so many AI developers just from literature / social science backgrounds. They know little about machine learning formulas and could understand model training based on their intuition of babysitting a child. It’s fun to discuss Zhenhuan Zhuan (a Chinese history drama) during lunch and do a lot of mind-practice like how to survive in a squid game. The initial idea of this role-of-talent is to build a global knowledge base on history, culture, and science to expand AGI capabilities. However, I do feel how they contribute to working efficiency / nurturing ideas of all the team, at least, making everyone happy and more focused when getting back to work from lunch.

Something random I hope to share at the end: It’s fun to solve some challenges to realize individual value or get a sense of achievement. In fact, it matters what “challenge” you are facing. The “challenge” here could just be “how to achieve AGI” – in such case, you actually do not need to worry too much about “what if this idea has been tried by someone else”, “what if someone achieves AGI faster than me”, “what if this idea is too simple” or “what if someone get paid more than me” – things many are indeed worried about. When what someone care is about achieving AGI, they could just try relentlessly about what is really useful and incorporate them into the model.

Resources and References:
Two interviews with DeepSeek founder Liang Wenfeng:
drive.google.com/file/d/1DW5ohZ…
drive.google.com/file/d/1gLw9jp…
chinatalk.media/p/deepseek-ceo…
DeepSeek hiring ads:
x.com/deepseek_ai/st…
liepin.com/job/1959357241…
And my experiences there.

Zhan Xianyuan retweeted

Jianxiong Li@NeurIPS 2025@Facebear_ljx·10 Dec

I'll present RSP and 3 other papers, on Multimodal Instruction Masking, Robotics Representation Learning, and Decision Making Data Editing, at NeurIPS 2024. See you in Vancouver!

Haoyi Niu@t641769919

We are working hard preparing the arXiv version—stay tuned for updates! For early access, catch RSP at NeurIPS'24 OWA Workshop poster session. Jianxiong @Facebear_ljx will be there, feel free to stop by and check it out!🧵(10/10) ⏰12.15 morning 📍East Building - MTG 1-3 + S.FOY

Zhan Xianyuan@atLargeIC·10 Dec

RSP will also be presented in the poster session of the NeurIPS 2024 Workshop on Open-World Agents (OWA). Welcome to check it out!

Zhan Xianyuan@atLargeIC

Offline RL models are becoming increasingly big nowadays, but are these heavy models truly necessary? Introducing our recent work RSP (AAAI 2025). We show that simply using shallow MLPs could work equally well or even better! The trick is to model it in a recursive way!

Zhan Xianyuan@atLargeIC·10 Dec

Haoyi Niu@t641769919

🚨Recursive Skip-Step Planning (RSP) Relying on larger, expressive models for sequential decision-making has recently become a popular choice, but are they truly necessary? Can we replace these heavy models? Yes—RSP empowers shallow MLPs to excel in long-horizon tasks!🧵(1/n)

Zhan Xianyuan retweeted

Haoyi Niu@t641769919·10 Feb

Domains (environments/embodiments) where we train policies often differ from those where we deploy them, leading to transfer challenges arising from domain gaps.🥳Excited to announce our latest work: A Comprehensive Survey of Cross-Domain Policy Transfer for Embodied Agents🤖➡️🌏
