Yuncong Yang

40 posts

@YuncongYY

Second-year CS PhD student at UMass Amherst, advised by @gan_chuang | Intern @MSFTResearch

Amherst, MA · Joined December 2021
140 Following · 324 Followers
Pinned Tweet
Yuncong Yang @YuncongYY ·
Test-time scaling nailed code & math—next stop: the real 3D world. 🌍 MindJourney pairs any VLM with a video-diffusion World Model, letting it explore an imagined scene before answering. One frame becomes a tour—and the tour leads to new SOTA in spatial reasoning. 🚀 🧵1/
3 replies · 25 reposts · 88 likes · 56.6K views
Yuncong Yang @YuncongYY ·
Strongly agree with Prof. Fei-Fei Li on the importance of world models for spatial intelligence. Echoing that view, our NeurIPS’25 MindJourney is an early attempt to use world models for spatial reasoning—still at an early stage, but we hope it sparks discussion! Code: github.com/UMass-Embodied… #EmbodiedAI #SpatialIntelligence #AI
Fei-Fei Li @drfeifei

AI’s next frontier is Spatial Intelligence, a technology that will turn seeing into reasoning, perception into action, and imagination into creation. But what is it? Why does it matter? How do we build it? And how can we use it? Today, I want to share with you my thoughts on building and using world models to unlock spatial intelligence in this essay below. 1/n

0 replies · 0 reposts · 7 likes · 1.4K views
Yuncong Yang @YuncongYY ·
Thrilled to share MindJourney is accepted to #NeurIPS2025! MindJourney brings controllable world models to Embodied AI reasoning—letting agents “imagine” spatial rollouts for better spatial understanding. We updated our codebase and model weights recently: github.com/UMass-Embodied… #EmbodiedAI #SpatialIntelligence #AI
Quoted tweet: Yuncong Yang @YuncongYY (the pinned tweet above)
0 replies · 0 reposts · 10 likes · 1.1K views
Yuncong Yang @YuncongYY ·
Just paid ¥4.99 to a site that "predicts" NeurIPS acceptance from your ratings and confidence scores. Total scam: basically a random number generator. 🤡 I should build my own startup for this. Pretty sure I could make a fortune off researchers' anxiety these days. #NeurIPS2025
2 replies · 0 reposts · 10 likes · 3K views
Yuncong Yang retweeted
Jianwei Yang @jw2yang4ai ·
VLMs struggle badly to interpret 3D from 2D observations, but what if they had a good mental model of the world? Check out our MindJourney: test-time scaling for spatial reasoning in the 3D world. Without any task-specific training, MindJourney imagines (acts mentally) step by step in the diffusion world model, gathering more "visual" information to address the problem. This new way of combining VLMs and world models could significantly unlock the power of thinking in space! Project: umass-embodied-agi.github.io/MindJourney/ Code: github.com/UMass-Embodied…
Quoted tweet: Yuncong Yang @YuncongYY (the pinned tweet above)
0 replies · 3 reposts · 32 likes · 3.3K views
Yuncong Yang retweeted
Chuang Gan @gan_chuang ·
Spatial reasoning from a single image is inherently difficult, but it becomes significantly easier when leveraging a controlled world model, analogous to the mental models used by humans! Code: github.com/UMass-Embodied…
Quoted tweet: Yuncong Yang @YuncongYY (the pinned tweet above)
2 replies · 9 reposts · 94 likes · 16.1K views
Yuncong Yang @YuncongYY ·
🎬 MindJourney in action Given a spatial reasoning question 1️⃣ Imagine – VLM and world model “walk” the scene iteratively 2️⃣ Observe – the VLM picks up the clues from the tour 3️⃣ Answer – with context, the VLM replies The imagination loop turns one frame into insight. 💡 🧵5/
1 reply · 0 reposts · 6 likes · 520 views
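The three-step loop in the tweet above (Imagine → Observe → Answer) can be sketched in a few lines. This is a hypothetical stand-in, not MindJourney's actual API: in the real system the move proposer and answerer are a VLM, and the rollout step is a video-diffusion world model; all function names here are illustrative placeholders.

```python
def vlm_propose_move(question, observations):
    """Stand-in policy: a real VLM would pick an exploratory camera move."""
    return "turn_left" if len(observations) % 2 else "move_forward"

def world_model_rollout(view, move):
    """Stand-in renderer: a real world model would synthesize a new frame."""
    return f"{view}->{move}"

def vlm_answer(question, observations):
    """Stand-in answerer: a real VLM would reason over the imagined tour."""
    return f"answer({question}) using {len(observations)} views"

def mindjourney_answer(question, frame, n_steps=3):
    """Imagine n_steps views from a single frame, then answer with the tour."""
    observations = [frame]
    for _ in range(n_steps):
        move = vlm_propose_move(question, observations)                  # 1. Imagine
        observations.append(world_model_rollout(observations[-1], move)) # 2. Observe
    return vlm_answer(question, observations)                            # 3. Answer
```

The key design point the thread emphasizes is that no component is retrained: only the test-time loop, which grows the observation list before the final answer, is new.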
Yuncong Yang retweeted
AK @_akhaliq ·
You can install anycoder as a Progressive Web App on your device. Visit huggingface.co/spaces/akhaliq…, click Settings in the footer, follow the instructions, and click the install button in your browser's address bar.
0 replies · 11 reposts · 49 likes · 31K views
Yuncong Yang retweeted
Martin Ziqiao Ma @ziqiao_ma ·
📣 Excited to announce SpaVLE: #NeurIPS2025 Workshop on Space in Vision, Language, and Embodied AI!
👉 …vision-language-embodied-ai.github.io
🦾 Co-organized with an incredible team → @fredahshi · @maojiayuan · @DJiafei · @ManlingLi_ · David Hsu · @Kordjamshidi
🌌 Why Space & SpaVLE? We never directly "see" space. Instead, we reconstruct it, describe it through language, and navigate its constraints to plan our actions. Let's bring together communities to tackle spatial intelligence across 2D/3D reasoning, grounded language, and real-world robotic planning.
📝 Call for Papers
• 4-page shorts or 9-page fulls (non-archival)
• Topics: spatial representation, grounding, datasets/benchmarks, foundation models & more
🗓️ Key Dates
• Deadline: Aug 22
• Notifications: Sep 22
• Camera-ready: Oct 25
Mark your calendars & start drafting! 🚀
🎤 Star-studded keynotes spanning CogSci, NLP, CV, and Robotics: Amir Zadeh · Barbara Landau · Dieter Fox · Joshua Tenenbaum @MITCoCoSci · Joyce Chai @SLED_AI · Ranjay Krishna @ranjaykrishna · Saining Xie @sainingxie. Can't wait for the insights!
🏆 Best Paper = $3k cloud credits; Runner-up $1.5k. Huge thanks to our sponsors: Lambda (@LambdaAPI), Alquist Robotics (@alquistrobotics), and EdenSign (@Edensign_ai). More sponsors welcome! DM us if you'd like to support spatial-AI research.
Join us in San Diego to push the frontiers of spatial understanding and reasoning across CV, NLP, and robotics!
0 replies · 28 reposts · 70 likes · 12K views