Zhan Xianyuan
@atLargeIC
51 posts

Associate professor at AIR (@AIRTHU1201), Tsinghua University (@Tsinghua_Uni)
Joined August 2010
158 Following · 81 Followers
Zhan Xianyuan@atLargeIC·
Want to know the best practices for constructing a diffusion-based E2E model for autonomous driving? Check out our latest work, HDP! We provide systematic analyses of a wide range of design choices in diffusion-based modeling, with 200+ km of real-vehicle testing. The devil is in the details!
Zheng Yinan@ZhengYinan2001

Excited to share our new work: Hyper Diffusion Planner (HDP)! We systematically unleash the potential of diffusion models for E2E AD — rethinking IL pre-training, and RL post-training. 10× closed-loop improvement over the base model, validated on 200 km of real-world driving.

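The diffusion-based planning idea behind HDP and Diffusion Planner can be illustrated with a minimal sketch. This is not the authors' model: the noise schedule, horizon, and the closed-form "denoiser" below (which stands in for a learned noise predictor and pulls samples toward a straight-line trajectory) are all illustrative assumptions.

```python
import numpy as np

# Illustrative only: a tiny DDPM-style sampler for 2-D waypoint trajectories.
# HDP's real architecture and schedule are not specified here; toy_denoiser is
# a stand-in for a learned noise predictor eps_theta(x, t).

T = 50                      # denoising steps
H = 8                       # planning horizon (number of waypoints)
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def toy_denoiser(x, t):
    """Analytic noise estimate, assuming the clean plan is a straight line."""
    target = np.linspace(0.0, 1.0, H)[:, None] * np.array([1.0, 0.0])
    return (x - np.sqrt(alpha_bars[t]) * target) / np.sqrt(1.0 - alpha_bars[t])

rng = np.random.default_rng(0)
x = rng.normal(size=(H, 2))              # start from pure noise
for t in reversed(range(T)):
    eps = toy_denoiser(x, t)
    # standard DDPM posterior-mean update
    x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    if t > 0:
        x += np.sqrt(betas[t]) * rng.normal(size=x.shape)

print(x.shape)  # (8, 2): a denoised waypoint trajectory
```

Because the toy denoiser is exact for its straight-line target, the loop recovers that line; a trained model would instead produce diverse, scene-conditioned trajectories.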
Zhan Xianyuan retweeted
LeRobot@LeRobotHF·
🚀 Introducing X-VLA: LeRobot's new soft-prompted Vision-Language-Action model. X-VLA is built to scale across many embodiments: different robots, cameras, action spaces, and environments, all handled by one unified transformer backbone.
- Generalist across robots (Franka, WidowX, Agibot, sim + real)
- Soft-prompt domain IDs let the model adapt to new hardware with tiny learnable embeddings
- Flow-matching + transformer core for smooth, continuous 50 Hz control
- Pretrained on a mixed-embodiment dataset spanning 7+ platforms and diverse tasks
- Fine-tune on any dataset using one of the 6 checkpoints we provide out of the box.
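The "soft-prompt domain IDs" in the tweet above can be sketched in a few lines. This is a hypothetical illustration, not X-VLA's released code: a small learnable embedding is looked up by domain ID and prepended to the observation tokens, so one shared backbone can condition on the embodiment; the dimensions and names below are made up.

```python
import numpy as np

# Hypothetical sketch of soft-prompting (not the actual X-VLA implementation).
rng = np.random.default_rng(0)
d_model, n_domains, n_prompt = 64, 7, 4

# One tiny learnable prompt (n_prompt x d_model) per robot/camera setup.
domain_prompts = rng.normal(scale=0.02, size=(n_domains, n_prompt, d_model))

def prepend_soft_prompt(obs_tokens, domain_id):
    """Concatenate the domain's soft prompt in front of the token sequence."""
    return np.concatenate([domain_prompts[domain_id], obs_tokens], axis=0)

obs = rng.normal(size=(16, d_model))       # 16 observation tokens
tokens = prepend_soft_prompt(obs, domain_id=2)
print(tokens.shape)  # (20, 64) = 4 prompt tokens + 16 observation tokens
```

Adapting to new hardware then means learning only a fresh prompt row while the transformer weights stay frozen or lightly fine-tuned.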
Zhan Xianyuan@atLargeIC·
Excited to introduce X-VLA! Our latest cross-embodied VLA model with remarkable sample & parameter efficiency. With only 0.9B parameters, X-VLA establishes new SOTA on a wide range of mainstream embodied benchmarks!
Jinliang Zheng@2_toinf

🚀 New: X-VLA (0.9B), a lightweight VLA model for cross-embodiment manipulation
🧺 2+ hrs cloth-folding autonomy, trained with only 1.5K demos
📊 Benchmarks: 96% Simpler-WidowX, 97.4% Libero-Long, 70% RoboTwin, + more
Project page: thu-air-dream.github.io/X-VLA/ (1/N)

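The flow-matching action head mentioned in the X-VLA announcements can also be illustrated minimally. Again a hypothetical sketch, not the released code: flow matching learns a velocity field v(x_t, t) that transports noise into actions, and inference integrates that field with a few Euler steps. Here the "learned" field is the analytic one for a single fixed target action.

```python
import numpy as np

# Hypothetical illustration of a flow-matching action head (not X-VLA's code).
target_action = np.array([0.3, -0.7])   # pretend the policy wants this action

def velocity(x, t):
    """Analytic velocity for the straight path x_t = (1-t)*noise + t*target."""
    return (target_action - x) / (1.0 - t)

rng = np.random.default_rng(0)
x = rng.normal(size=2)                  # start from Gaussian noise
n_steps = 10
dt = 1.0 / n_steps
t = 0.0
for _ in range(n_steps):
    x = x + dt * velocity(x, t)         # Euler integration of the ODE
    t += dt

print(x)  # close to target_action
```

A trained head conditions the velocity field on observations and language, which is what lets one network emit smooth, continuous high-rate controls.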
Zhan Xianyuan retweeted
Moritz Reuss@moritz_reuss·
VLAs have become the fastest-growing subfield in robot learning. So where are we now? After reviewing ICLR 2026 submissions and conversations at CoRL, I wrote an overview of the current state of VLA research with some personal takes: is.gd/1pqw9w
Zhan Xianyuan retweeted
Puneesh Deora@puneeshdeora·
Global reaction to Overleaf going down
Zhan Xianyuan retweeted
Josh Clymer@joshua_clymer·
AI R&D automation is often cited as a risk, but what's the threat model exactly? There are several, and they are often conflated. I co-wrote a paper with 13 international experts that breaks down AI R&D threat models, "red line" thresholds, and mitigations.
Zhan Xianyuan@atLargeIC·
Glad to share another great work from our lab: Diffusion Planner! The SOTA planning model for autonomous driving. Rule-based refinement may become history for learning-based autonomous driving models, thanks to powerful diffusion models!
Zheng Yinan@ZhengYinan2001

🥳 New SOTA on nuPlan! #ICLR2025
🚗 Diffusion Planner leverages diffusion models with a specifically designed architecture for high-performance motion planning, reducing dependence on rule-based refinement.
🔗 zhengyinan-air.github.io/Diffusion-Plan…

Zhan Xianyuan retweeted
Andrej Karpathy@karpathy·
TinyZero reproduction of R1-Zero: "experience the Ahah moment yourself for < $30". Given a base model, the RL finetuning can be relatively cheap and quite accessible.
Jiayi Pan@jiayi_pirate

We reproduced DeepSeek R1-Zero in the CountDown game, and it just works Through RL, the 3B base LM develops self-verification and search abilities all on its own You can experience the Ahah moment yourself for < $30 Code: github.com/Jiayi-Pan/Tiny… Here's what we learned 🧵

Zhan Xianyuan retweeted
Haoyi Niu@t641769919·
🥳Happy to announce H2O+ is accepted at #ICRA2025. Removing the over-conservatism and facilitating exploration in simulation, H2O+ is open to any Offline/Online RL design. Stay tuned for more updates! We will release our code in the following month. Happy Chinese New Year!🎊🏮🐍
Haoyi Niu@t641769919

📢Thrilled to have one paper accepted at #NeurIPS2022🥂with awesome teammates @HappyyPablo @EvieQ_01 advised by @atLargeIC💐Hope the Hybrid Offline-and-Online setting can shed new light on attractive solutions to the real-world deployment for RL. arxiv.org/abs/2206.13464

Zhan Xianyuan retweeted
Vedang Vatsa FRSA@vedangvatsa·
🧵 Hidden Gems in DeepSeek-R1’s Paper
Zhan Xianyuan@atLargeIC·
Introducing UniAct! One of my favorite works finished last year in my lab. UniAct enables learning universal actions to power ANY robotic embodiment, regardless of physical form and control interface. The project page is at: 2toinf.github.io/UniAct/. Stay tuned for more exciting news!!!
Jinliang Zheng@2_toinf

#Robotic_Foundation_Models #Universal_Actions 0.5B beats 7B models🦾! We introduce a highly effective pretraining paradigm for embodied foundation models, powered by a Universal Action Space that boosts cross-embodiment data sharing and generalization! 2toinf.github.io/UniAct/

Zhan Xianyuan retweeted
Zihan "Zenus" Wang ✈️ ICLR
[Long Tweet Ahead] I just have to say, I'm genuinely impressed by DeepSeek. 💡 It's no wonder their reports are so elegant and fluffless. Here's what I noticed about their culture, a space where real innovation thrives, during my time there ↓

🌟 1. Be nice and careful to talents
- The recruiting teams seek top talent from China & globally. Many are PhD / grad / undergrad students from Chinese top-10 universities, e.g., Tsinghua / Peking University.
- Hiring is minimalist: my interview took only a few rounds. They basically check two criteria: do you genuinely WANT to push fundamental AI problems forward, and CAN you make it happen (at least one standout skill, plus solid skills to get things done)?
- Roles seem shaped around the talent, instead of vice versa. Rather than "we need a role, so we find a talent", they ask: "Here's an exceptional talent; how can they contribute?" This can lead to something unconventional: they can hire someone with expertise in MBTI who ends up focusing on creating more personalized / role-playing models.
- Something basic: top-tier benefits in China, including for interns, allowing people to concentrate on work and worry less about material concerns.

🤝 2. Individualized HR culture
- With the talent-first hiring logic above, even at a 200-person scale, everyone still feels unique; there is no standardization where everyone can be replaced like a cog in a machine.
- No pressure or forced KPIs. I hardly felt any sense of "this must be done by this Thursday" from my mentor / seniors / colleagues.
- Collaborative by design: DeepSeek tries its best to forbid races inside the company. Everyone contributes their own (orthogonal) ideas to the final model and hopes their idea is useful. If an idea is proved useful, everyone celebrates and everyone is happy about it.

⚙️ 3. Disentangled development systems
- DeepSeek covers a highly diverse set of talent directions. It's like the "expert specialization" in their MoE models: people focus on what they're best at, and it's natural to ask others about things outside your own expertise. Helping others with one's expertise is not something people do only after completing their own work.
- There is a shared basic pipeline that works pretty well for everyone. When a group adds new things to the system, they write really good documentation, so others can learn in a minute what happened and how it affects their own roles (most of the time it doesn't affect their work; they just feel things improve automatically).
- Feedback loops are FAST. Verifying an idea basically means testing it on the very latest simplified baseline. Whenever I had an idea in the morning, I could know whether it was effective by the afternoon: no organizational approval, no hard GPU-utilization restrictions, little debugging (thanks to the rigorously debugged baseline), just seamlessly adding my own idea to the model. This makes the beginning of an idea super reflective and feedback-rich, even if many ablations are required later to finally merge the idea into the giant model.

All of the above makes the organization extremely friendly to spontaneous people, and maybe this is why you can always trust their tech paths even when many improvements / ideas are applied in each single model release. I appreciate such a disentangled organization, which enables fast and solid iteration on the model from different angles.

🌍 4. Diversity sparks innovation
It's not really about "we must consider every party". They pay attention to inclusion, but it's not the biggest matter. The biggest matter lies in "how can people from diverse backgrounds contribute to the DeepSeek model?" I had many colleagues called know-it-alls ("百晓生"), a role-of-talent that DeepSeek hires. As an AI company, it's interesting to see so many AI developers coming from literature / social-science backgrounds. They know little about machine-learning formulas, yet they understand model training through their intuition for babysitting a child. It's fun to discuss Zhenhuan Zhuan (a Chinese historical drama) during lunch and do a lot of thought exercises like how to survive a squid game. The initial idea of this role-of-talent was to build a global knowledge base on history, culture, and science to expand AGI capabilities. But I really did feel how they contribute to the working efficiency and idea-nurturing of the whole team, at the very least by making everyone happier and more focused when getting back to work after lunch.

Something random I'd like to share at the end: it's fun to solve challenges to realize individual value or get a sense of achievement. In fact, it matters what "challenge" you are facing. The "challenge" here can simply be "how to achieve AGI". In that case, you don't actually need to worry much about "what if this idea has been tried by someone else", "what if someone achieves AGI faster than me", "what if this idea is too simple", or "what if someone gets paid more than me": things many people are indeed worried about. When what someone cares about is achieving AGI, they can just relentlessly try what is really useful and incorporate it into the model.

Resources and references. Two interviews with DeepSeek founder Liang Wenfeng: drive.google.com/file/d/1DW5ohZ… drive.google.com/file/d/1gLw9jp… chinatalk.media/p/deepseek-ceo… DeepSeek hiring ads: x.com/deepseek_ai/st… liepin.com/job/1959357241… And my experiences there.
Zhan Xianyuan retweeted
Jianxiong Li@NeurIPS 2025@Facebear_ljx·
I'll present RSP and 3 other papers, on Multimodal Instruction Masking, Robotics Representation Learning, and Decision-Making Data Editing, at NeurIPS 2024. See you in Vancouver!
Haoyi Niu@t641769919

We are working hard preparing the arXiv version—stay tuned for updates! For early access, catch RSP at NeurIPS'24 OWA Workshop poster session. Jianxiong @Facebear_ljx will be there, feel free to stop by and check it out!🧵(10/10) ⏰12.15 morning 📍East Building - MTG 1-3 + S.FOY

Zhan Xianyuan@atLargeIC·
Offline RL models are becoming increasingly big nowadays, but are these heavy models truly necessary? Introducing our recent work RSP (AAAI 2025). We show that simply using shallow MLPs could work equally well or even better! The trick is to model it in a recursive way!
Haoyi Niu@t641769919

🚨Recursive Skip-Step Planning (RSP) Relying on larger, expressive models for sequential decision-making has recently become a popular choice, but are they truly necessary? Can we replace these heavy models? Yes—RSP empowers shallow MLPs to excel in long-horizon tasks!🧵(1/n)

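The recursive trick behind RSP can be sketched in a few lines. This is a hypothetical illustration, not the paper's code: a shallow model predicts the *midpoint* state between two anchor states, and recursion fills in the full horizon; the stand-in "model" below is plain interpolation rather than a trained MLP.

```python
import numpy as np

# Hypothetical sketch of recursive skip-step planning (not the RSP codebase).
def midpoint_model(s_a, s_b):
    """Stand-in for a shallow MLP f(s_a, s_b) -> state halfway in time."""
    return 0.5 * (s_a + s_b)

def recursive_plan(s_start, s_goal, depth):
    """Bisect the horizon: depth d yields 2**d + 1 states incl. endpoints."""
    if depth == 0:
        return [s_start, s_goal]
    mid = midpoint_model(s_start, s_goal)
    left = recursive_plan(s_start, mid, depth - 1)
    right = recursive_plan(mid, s_goal, depth - 1)
    return left + right[1:]            # drop the duplicated midpoint

plan = recursive_plan(np.zeros(2), np.array([8.0, 4.0]), depth=3)
print(len(plan))  # 9 states spanning the horizon
```

Because each call only bridges half the remaining horizon, a small network never has to model long-horizon dynamics in one shot, which is one way to read the paper's claim that shallow MLPs suffice.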
Zhan Xianyuan retweeted
Haoyi Niu@t641769919·
Domains (environments/embodiments) where we train policies often differ from those where we deploy them, leading to transfer challenges arising from domain gaps.🥳Excited to announce our latest work: A Comprehensive Survey of Cross-Domain Policy Transfer for Embodied Agents🤖➡️🌏