Siyin Wang @ICLR 2026

@wang_siyin

PhD Student at Fudan University #NLProc #LLM

28 posts
Joined December 2022
158 Following · 45 Followers

Xuning Yang (@xuningy)
When every generalist robot model scores 95%+ on a benchmark, the numbers become meaningless. What if we built a photorealistic benchmark that never saturates and can generate new scenes and tasks with AI Workflows in minutes? We introduce RoboLab! 🧵(1/6)

Siyin Wang @ICLR 2026 (@wang_siyin)
Made it to Rio after nearly 30 hours in the air ✈️ Excited to be at #ICLR2026! We'll be sharing our work on RoboOmni on Apr 25 (10:30 AM – 1:00 PM). If you're interested in VLAs, Omni-LLMs, World Models, or Multimodal Agentic AI, feel free to reach out! #multimodality #VLA

Siyin Wang @ICLR 2026 reposted
机器之心 JIQIZHIXIN (@jiqizhixin)
What if robots could understand what you want without being told? RoboOmni makes that possible: an omni-modal LLM that fuses speech, sound, and vision to infer human intent, confirm actions, and execute tasks. Trained on the new OmniAction dataset (140k episodes), it outperforms text- and ASR-based baselines in success rate, speed, and proactive assistance, paving the way for more intuitive human-robot collaboration.

RoboOmni: Proactive Robot Manipulation in Omni-modal Context
Fudan, SII, NUS
Paper: arxiv.org/abs/2510.23763
Code: github.com/OpenMOSS/RoboO…
Project: OpenMOSS.github.io/RoboOmni
Our report: mp.weixin.qq.com/s/PXBqdEW7_Ta_…

📬 #PapersAccepted by Jiqizhixin
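
For readers who want the shape of the pipeline, here is a minimal sketch of the loop the post describes: perceive, infer intent, confirm, execute. Every name in it (OmniContext, infer_intent, plan_actions, confirm, execute) is an illustrative assumption, not the actual RoboOmni API.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class OmniContext:
    speech: bytes        # user utterances from the microphone
    sound: bytes         # ambient audio, e.g. a kettle whistling
    frames: List[bytes]  # recent RGB camera frames

def proactive_step(omni_llm, robot, ctx: OmniContext):
    # 1. Infer intent from the fused speech + sound + vision context,
    #    with no explicit text command required (hypothetical method).
    intent = omni_llm.infer_intent(ctx.speech, ctx.sound, ctx.frames)
    # 2. Proactively confirm the inferred intent before acting.
    if not robot.confirm(f"Should I {intent.description}?"):
        return
    # 3. Execute the grounded action sequence.
    for action in omni_llm.plan_actions(intent, ctx.frames):
        robot.execute(action)
```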

Siyin Wang @ICLR 2026 (@wang_siyin)
I will be at EMNLP from Nov 5th to Nov 8th. If you are interested in multimodal spatial reasoning, Embodied AI (like VLAs), or Omni-LLMs, please feel free to chat with me! 👋 📍 I will also present a poster (ConvSearch-R1) with Changtai on Nov 6th, 16:30–18:00. #EMNLP2025 #suzhou

Siyin Wang @ICLR 2026 reposted
DailyPapers (@HuggingPapers)
Unveiling the hidden fragilities of AI robotics. New research introduces LIBERO-Plus, a comprehensive benchmark that systematically reveals vulnerabilities in Vision-Language-Action models under 7 real-world perturbation dimensions.

Siyin Wang @ICLR 2026 (@wang_siyin)
🚀 Tired of LIBERO? Try our LIBERO-Plus! 🤔 LIBERO is at 99%, but we've found VLAs drop points under even minor disturbances. 🤩 Switch to LIBERO-Plus in just a few steps and unlock your VLA's true generalization ability. #Embodied #VLA #Robotics
Senyu Fei (@SenyuFei)

🤯 Shocking findings from our new LIBERO-Plus benchmark for VLA robustness! 💡 Key Insight: High LIBERO scores ≠ strong models. 🔗 Paper: huggingface.co/papers/2510.13… 🌐 Page: sylvestf.github.io/LIBERO-plus 💻 Code: github.com/sylvestf/LIBER… ⭐ Star us & 🚀 upvote! #VLA #Robotics 1/8

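A sketch of how a perturbation benchmark like this is typically driven: re-run the same VLA policy under each perturbation dimension and report the drop against the clean setting. The dimension names, make_env(), and policy.act() below are assumptions for illustration, not the LIBERO-Plus API.

```python
# Placeholder names for the 7 perturbation dimensions mentioned above.
PERTURBATIONS = ["camera", "lighting", "background", "object_layout",
                 "language", "robot_init", "sensor_noise"]

def success_rate(policy, env, episodes=50):
    wins = 0
    for _ in range(episodes):
        obs, done, info = env.reset(), False, {}
        while not done:
            # Classic gym-style step: the policy maps observations to actions.
            obs, reward, done, info = env.step(policy.act(obs))
        wins += int(info.get("success", False))
    return wins / episodes

def robustness_report(policy, make_env):
    # Compare each perturbed setting against the unperturbed baseline.
    clean = success_rate(policy, make_env(perturbation=None))
    print(f"{'clean':>14}: {clean:.1%}")
    for dim in PERTURBATIONS:
        score = success_rate(policy, make_env(perturbation=dim))
        print(f"{dim:>14}: {score:.1%} (drop {clean - score:.1%})")
```

A high clean score with large per-dimension drops is exactly the "High LIBERO scores ≠ strong models" failure mode the thread points at.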

Fei Liu (@feiliu_nlp)
🏆 Thrilled that our paper #PlanGenLLMs (arxiv.org/abs/2502.11221) won the SAC Award at #ACL2025!! Couldn't have done it without the amazing team: @HuiWei15, Zihao Zhang, Shenghua He, Tian Xia, and Shijia Pan. So thankful and beyond proud! 💖 #ACL2025NLP #NLProc

🧠 Planning is a core aspect of both human and artificial intelligence. LLMs/agents have been used in various planning tasks, from navigating websites and planning trips to querying databases, but most benchmarks are narrow and task-specific. That makes it difficult to compare systems across domains, or to figure out which one is best for a new planning problem.

That's where our paper comes in: we offer a comprehensive overview of LLM-based planning agents, highlighting gaps, challenges, and what's next. Check it out 👉 arxiv.org/abs/2502.11221

Siyin Wang @ICLR 2026 (@wang_siyin)
Excited to be in Vienna! Come by if you're around! 👋 #ACL2025
📌 Poster: VisuoThink (Thinking with images, Multimodal reasoning)
📌 Poster: D2PO (World Modeling, Embodied)
🗓️ Tue, July 29, 16:00–17:30 | 📍 Hall 4/5 (Session 3)
Presenting with @ngc7293q and @JinlanFu! 👏
Siyin Wang @ICLR 2026 (@wang_siyin)

Thrilled to share our TWO papers accepted to #ACL2025 Main Conference! 🥳🎉 🎨VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search 🌏World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning #AI #MultimodalLearning #worldmodel

Siyin Wang @ICLR 2026 (@wang_siyin)
Exciting work on Embodied Agents! 🦾 By leveraging interactive reinforcement learning, we shatter the ceiling on ALFWorld (31.05 -> 97.78) and ScienceWorld (22.05 -> 79.92). 🔥 Huge thanks to amazing coauthors @ngc7293q, Li Ji, Junhao, @JingjingGong_, and @xpqiu
Zhaoye Fei(ngc7293) (@ngc7293q)

🚀 New work: OpenMOSS Embodied Planner-R1 - A step toward AI self-improvement in interactive planning! We've developed an RL framework where LLMs learn to plan through autonomous environmental exploration - no human demonstrations needed. 🤖 🧵 Thread below 👇

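The exploration-driven training the thread describes could look roughly like the sketch below: collect a trajectory by letting the LLM act in the environment, then update the model from environment reward alone. format_history() and the llm/optimizer methods are hypothetical, and this is a generic policy-gradient loop, not the released Planner-R1 code.

```python
def rollout(llm, env, max_steps=30):
    """Collect one trajectory by letting the LLM act in the environment."""
    obs = env.reset()
    history, total_reward = [], 0.0
    for _ in range(max_steps):
        # The LLM plans its next action from the interaction history alone;
        # no human demonstrations enter the loop (format_history is assumed).
        action = llm.generate(format_history(history, obs))
        obs, reward, done = env.step(action)
        history.append((obs, action, reward))
        total_reward += reward
        if done:
            break
    return history, total_reward

def train(llm, env, optimizer, iterations=1000):
    for _ in range(iterations):
        history, ret = rollout(llm, env)
        # Policy-gradient-style update: reinforce the actions taken,
        # scaled by the return. All reward comes from the environment.
        loss = -ret * llm.logprob_of_actions(history)
        optimizer.zero_grad(); loss.backward(); optimizer.step()
```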

Siyin Wang @ICLR 2026 (@wang_siyin)
Thrilled to share our TWO papers accepted to #ACL2025 Main Conference! 🥳🎉 🎨VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search 🌏World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning #AI #MultimodalLearning #worldmodel

Siyin Wang @ICLR 2026 (@wang_siyin)
🔎 We also compared action-conditioned world models (predict the next state given the current state and action) vs. goal-directed world models (imagine a future state from the history and goal). While action-conditioned models excel in familiar settings, goal-directed models generalize better to novel environments! 7/8
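
The distinction reads cleanly as two interfaces; a sketch assuming both act as single-step predictors (class and method names are illustrative, not from the paper):

```python
class ActionConditionedWM:
    def predict(self, state, action):
        """Next state, given the current state and a chosen action."""
        ...

class GoalDirectedWM:
    def imagine(self, history, goal):
        """A plausible future state, given interaction history and the goal."""
        ...
```

Note that the goal-directed interface never takes an executed action as input, which offers one plausible intuition for the reported generalization gap: it is not tied to the action dynamics of any one environment.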

Siyin Wang @ICLR 2026 (@wang_siyin)
✨ Excited to share our latest research “World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning”! 🤔 Current LVLMs struggle with grounding in embodied environments. How can we make AI agents understand the physical world like humans? 1/8