Zeyuan Yang

@miiche_yang

UMass PhD | Current Intern @ Samsung | Previous @ THU

Joined May 2022

106 Following · 35 Followers

Zeyuan Yang retweeted

Yuncong Yang @YuncongYY · Jul 18

Thanks @_akhaliq for sharing our work! MindJourney fuses a world model with any VLM, so the model can first imagine walking around before it answers. From “one snapshot” to “what if I stand over there?”—and suddenly spatial reasoning hits SOTA. 🚀 Project Page: umass-embodied-agi.github.io/MindJourney/

AK@_akhaliq

MindJourney: Test-Time Scaling with World Models for Spatial Reasoning

14.3K

Zeyuan Yang retweeted

Martin Ziqiao Ma @ziqiao_ma · Jul 10

📣 Excited to announce SpaVLE: #NeurIPS2025 Workshop on Space in Vision, Language, and Embodied AI!
👉 …vision-language-embodied-ai.github.io
🦾 Co-organized with an incredible team → @fredahshi · @maojiayuan · @DJiafei · @ManlingLi_ · David Hsu · @Kordjamshidi
🌌 Why Space & SpaVLE? We never directly “see” space. Instead, we reconstruct it, describe it through language, and navigate its constraints to plan our actions. Let’s bring together communities to tackle spatial intelligence across 2D/3D reasoning, grounded language, and real-world robotic planning.
📝 Call for Papers
• 4-page shorts or 9-page fulls (non-archival)
• Topics: spatial representation, grounding, datasets/benchmarks, foundation models & more.
🗓️ Key Dates
• Deadline: Aug 22
• Notifications: Sep 22
• Camera-ready: Oct 25
Mark your calendars & start drafting! 🚀
🎤 Star-studded keynotes spanning CogSci, NLP, CV, Robotics. Amir Zadeh · Barbara Landau · Dieter Fox · Joshua Tenenbaum @MITCoCoSci · Joyce Chai @SLED_AI · Ranjay Krishna @ranjaykrishna · Saining Xie @sainingxie. Can’t wait for the insights!
🏆 Best Paper = $3 k cloud credits + Runner-up $1.5 k. Huge thanks to our sponsors: Lambda (@LambdaAPI), Alquist Robotics (@alquistrobotics), and EdenSign (@Edensign_ai). More sponsors welcome! DM us if you’d like to support spatial-AI research.
Join us in San Diego to push the frontiers of spatial understanding and reasoning across CV, NLP, and robotics!

12K

Zeyuan Yang @miiche_yang · Jun 27

Thanks @_akhaliq for sharing our work! Step into the Mirage and explore multimodal reasoning!

AK@_akhaliq

Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens

11.8K

Zeyuan Yang @miiche_yang · Jun 27

VLMs can think visually without generating pixels! I am thrilled to announce Mirage, a multimodal reasoning framework that interleaves latent visual representations among text tokens!

Chuang Gan@gan_chuang

VLM can think visually without generating pixels! VLM can think visually without generating pixels! VLM can think visually without generating pixels! 📢 We introduce Machine Mental Imagery (Mirage): a new framework that enables VLM to imagine using latent visual tokens—performing visual reasoning in latent space, no pixel rendering needed! We achieve this through a two-phase training paradigm: ✅ Stage 1: Grounding latent tokens in the visual subspace (joint supervision) ✅ Stage 2: Anchoring grounded tokens for generation (text-only supervision) Mirage demonstrates strong performance on a wide range of multimodal reasoning tasks! 📜Paper: arxiv.org/abs/2506.17218 🧑‍💻Code: github.com/UMass-Embodied… 📽️Project Page: vlm-mirage.github.io

834

Zeyuan Yang retweeted

Chuang Gan @gan_chuang · Jun 20

World Simulator, reimagined — now alive with humans, robots, and their vibrant society unfolding in 3D real-world geospatial scenes across the globe! 🚀 One day soon, humans and robots will co-exist in the same world. To prepare, we must address: 1️⃣ How can robots cooperate or compete intelligently? 2️⃣ How do humans build social bonds and communities? 3️⃣ How can both co-exist in an open, dynamic world? Announcing Virtual Community Project — a social-physical world simulator, where human characters and robotic agents can interact, grow, and co-evolve within open-world societies, stretching from London to New York, and beyond! Key features include: ✅ Unified multi-agent physics simulations for rich social + physical interactions of humans and robots ✅ Massive auto-generated 3D scenes grounded with real-world geospatial data ✅ Agent communities populated by robots and LLM-driven human characters with rich appearances, personalities, and social ties. 🌍 Enter our Virtual Community, an open world to study embodied AI at scale — one social-physical world model at a time! 🔗 Project: virtual-community-ai.github.io 💻 Code: github.com/UMass-Embodied… Paper: virtual-community-ai.github.io/paper.pdf 1/n

271

88.2K
