Yuncong Yang

40 posts

@YuncongYY

Second-year CS PhD student at UMass Amherst, advised by @gan_chuang | Intern @MSFTResearch

Amherst, MA · Joined December 2021
140 Following · 324 Followers
Pinned Tweet
Yuncong Yang @YuncongYY ·
Test-time scaling nailed code & math—next stop: the real 3D world. 🌍 MindJourney pairs any VLM with a video-diffusion World Model, letting it explore an imagined scene before answering. One frame becomes a tour—and the tour leads to new SOTA in spatial reasoning. 🚀 🧵1/
3 replies · 25 reposts · 88 likes · 56.6K views
Yuncong Yang @YuncongYY ·
Strongly agree with Prof. Fei-Fei Li on the importance of world models for spatial intelligence. Echoing that view, our NeurIPS’25 MindJourney is an early attempt to use world models for spatial reasoning—still at an early stage, but we hope it sparks discussion! Code: github.com/UMass-Embodied… #EmbodiedAI #SpatialIntelligence #AI
Fei-Fei Li @drfeifei

AI’s next frontier is Spatial Intelligence, a technology that will turn seeing into reasoning, perception into action, and imagination into creation. But what is it? Why does it matter? How do we build it? And how can we use it? Today, I want to share with you my thoughts on building and using world models to unlock spatial intelligence in this essay below. 1/n

0 replies · 0 reposts · 7 likes · 1.4K views
Yuncong Yang @YuncongYY ·
Thrilled to share MindJourney is accepted to #NeurIPS2025! MindJourney brings controllable world models to Embodied AI reasoning—letting agents “imagine” spatial rollouts for better spatial understanding. We updated our codebase and model weights recently: github.com/UMass-Embodied… #EmbodiedAI #SpatialIntelligence #AI
Quoted tweet: Yuncong Yang @YuncongYY (the pinned tweet above)
0 replies · 0 reposts · 10 likes · 1.1K views
Yuncong Yang @YuncongYY ·
Just paid ¥4.99 to a site that "predicts" NeurIPS acceptance from your ratings and confidence scores. Total scam: basically a random number generator. 🤡 I should build my own startup for this. Pretty sure I could make a fortune off researchers' anxiety these days. #NeurIPS2025
2 replies · 0 reposts · 10 likes · 3K views
Yuncong Yang retweeted
Jianwei Yang @jw2yang4ai ·
VLMs struggle badly to interpret 3D from 2D observations, but what if they had a good mental model of the world? Check out our MindJourney: test-time scaling for spatial reasoning in the 3D world. Without any task-specific training, MindJourney imagines (acts mentally) step by step in the diffusion world model, gathering more "visual" information to address the problem. This new way of combining VLMs and world models could significantly unlock the power of thinking in space! Project: umass-embodied-agi.github.io/MindJourney/ Code: github.com/UMass-Embodied…
Quoted tweet: Yuncong Yang @YuncongYY (the pinned tweet above)
0 replies · 3 reposts · 32 likes · 3.3K views
Yuncong Yang retweeted
Chuang Gan @gan_chuang ·
Spatial reasoning from a single image is inherently difficult, but it becomes significantly easier when leveraging a controlled world model, analogous to the mental models used by humans! Code: github.com/UMass-Embodied…
Quoted tweet: Yuncong Yang @YuncongYY (the pinned tweet above)
2 replies · 9 reposts · 94 likes · 16.1K views
Yuncong Yang @YuncongYY ·
🎬 MindJourney in action Given a spatial reasoning question 1️⃣ Imagine – VLM and world model “walk” the scene iteratively 2️⃣ Observe – the VLM picks up the clues from the tour 3️⃣ Answer – with context, the VLM replies The imagination loop turns one frame into insight. 💡 🧵5/
1 reply · 0 reposts · 6 likes · 520 views
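The three-step loop in the tweet above (Imagine → Observe → Answer) can be sketched in a few lines. This is a hypothetical stand-in, not MindJourney's actual API: in the real system the move proposer and answerer are a VLM, and the rollout step is a video-diffusion world model; all function names here are illustrative placeholders.

```python
def vlm_propose_move(question, observations):
    """Stand-in policy: a real VLM would pick an exploratory camera move."""
    return "turn_left" if len(observations) % 2 else "move_forward"

def world_model_rollout(view, move):
    """Stand-in renderer: a real world model would synthesize a new frame."""
    return f"{view}->{move}"

def vlm_answer(question, observations):
    """Stand-in answerer: a real VLM would reason over the imagined tour."""
    return f"answer({question}) using {len(observations)} views"

def mindjourney_answer(question, frame, n_steps=3):
    """Imagine n_steps views from a single frame, then answer with the tour."""
    observations = [frame]
    for _ in range(n_steps):
        move = vlm_propose_move(question, observations)                  # 1. Imagine
        observations.append(world_model_rollout(observations[-1], move)) # 2. Observe
    return vlm_answer(question, observations)                            # 3. Answer
```

The key design point the thread emphasizes is that no component is retrained: only the test-time loop, which grows the observation list before the final answer, is new.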
Yuncong Yang retweeted
AK @_akhaliq ·
You can install anycoder as a Progressive Web App on your device. Visit huggingface.co/spaces/akhaliq…, click Settings in the footer, follow the instructions, and click the install button in your browser's address bar.
0 replies · 11 reposts · 49 likes · 31K views
Yuncong Yang retweeted
Martin Ziqiao Ma @ziqiao_ma ·
📣 Excited to announce SpaVLE: #NeurIPS2025 Workshop on Space in Vision, Language, and Embodied AI!
👉 …vision-language-embodied-ai.github.io
🦾 Co-organized with an incredible team → @fredahshi · @maojiayuan · @DJiafei · @ManlingLi_ · David Hsu · @Kordjamshidi
🌌 Why Space & SpaVLE? We never directly "see" space. Instead, we reconstruct it, describe it through language, and navigate its constraints to plan our actions. Let's bring together communities to tackle spatial intelligence across 2D/3D reasoning, grounded language, and real-world robotic planning.
📝 Call for Papers
• 4-page shorts or 9-page fulls (non-archival)
• Topics: spatial representation, grounding, datasets/benchmarks, foundation models & more
🗓️ Key Dates
• Deadline: Aug 22
• Notifications: Sep 22
• Camera-ready: Oct 25
Mark your calendars & start drafting! 🚀
🎤 Star-studded keynotes spanning CogSci, NLP, CV, and Robotics: Amir Zadeh · Barbara Landau · Dieter Fox · Joshua Tenenbaum @MITCoCoSci · Joyce Chai @SLED_AI · Ranjay Krishna @ranjaykrishna · Saining Xie @sainingxie. Can't wait for the insights!
🏆 Best Paper = $3k cloud credits; Runner-up $1.5k. Huge thanks to our sponsors: Lambda (@LambdaAPI), Alquist Robotics (@alquistrobotics), and EdenSign (@Edensign_ai). More sponsors welcome! DM us if you'd like to support spatial-AI research.
Join us in San Diego to push the frontiers of spatial understanding and reasoning across CV, NLP, and robotics!
0 replies · 28 reposts · 70 likes · 12K views