Weijie Wang

61 posts

@wjwang2003

PhD student at ZIP Lab, Zhejiang University. Research Intern @ ByteDance Seed | Microsoft Research

Zhejiang, China · Joined February 2024

104 Following · 243 Followers

Pinned Tweet

Weijie Wang@wjwang2003·5d

🚀 Introducing World-R1: Video models already know 3D — they just need RL to wake it up! No arch changes. No video training data. No extra inference cost.⬇️ 🌐Website: aka.ms/world-r1

533

54.5K

Weijie Wang@wjwang2003·12h

@yourkaisensei You only need to design custom text prompts and trajectory generation strategies. We aim to offer a new perspective instead of merely a trained model.🥰

kai@yourkaisensei·5d

@wjwang2003 curious how you’re checking for reward hacking here. do the 3D gains hold on weird camera paths outside the training trajectory distribution?

213

Weijie Wang@wjwang2003·12h

@yourkaisensei Most camera gains come from latent injection. You can add randomization in training to adapt to diverse camera paths.

Weijie Wang@wjwang2003·13h

@MrManderly It’s just because of X’s video size limit. Not heavy computation. I’m not sure if upgrading helps🤣

MrManderly@MrManderly·5d

@wjwang2003 This is fantastic work. Can we infer that this is currently very expensive computationally from the low frame rate examples?

149

Weijie Wang@wjwang2003·1d

World-R1 has been accepted at @icmlconf. See you in Seoul! 📄 Paper: arxiv.org/abs/2604.24764 📷 Code: github.com/microsoft/Worl…

104

12.1K

Weijie Wang@wjwang2003·3d

@aswinrrv @AzmineWasi @icmlconf Some people say they got a different type.

Aswin RRV@aswinrrv·3d

@AzmineWasi @icmlconf Can anyone confirm if everyone is seeing this, even with higher scores?

405

Azmine Wasi @ICML@AzmineWasi·3d

@icmlconf ICML Position Paper decisions seem to be out, indirectly 👀 Public release or in-person presentation...?

2.5K

Weijie Wang@wjwang2003·3d

@AzmineWasi @icmlconf I have the same question

342

Weijie Wang@wjwang2003·4d

@taziku_co Thanks for sharing, @taziku_co!

田中義弘 | taziku CEO / AI × Creative@taziku_co·5d

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation. microsoft.github.io/World-R1/ via @wjwang2003

667

田中義弘 | taziku CEO / AI × Creative@taziku_co·5d

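Video models knew a bit of 3D from the start!? World-R1 is a method that steers existing text-to-video models toward 3D consistency at no additional inference cost, improving object permanence, geometric consistency, and camera control. It also scored very highly in human evaluation, with 92% for geometric consistency. Details in the 🧵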

1.4K

AK@_akhaliq·5d

Microsoft presents World-R1: Reinforcing 3D Constraints for Text-to-Video Generation. Paper: huggingface.co/papers/2604.24…

111

23.7K

Weijie Wang@wjwang2003·5d

@_akhaliq Thanks for sharing @_akhaliq! 🙏 🏠 Project: aka.ms/world-r1 💻 Code: github.com/microsoft/Worl…

315

Weijie Wang@wjwang2003·5d

@newlinedotco Totally agree on the inference tax concern! Good news: World-R1 adds zero overhead at inference. 3D foundation models and VLM critics only serve as reward signals during RL training. Once trained, it's the same architecture, same speed, just better 3D understanding built in ✅

375

💥 \newline@newlinedotco·5d

@wjwang2003 video models knowing 3d is a massive unlock but the real hurdle for agents has always been the inference tax of running these heavy vision-to-action loops.

418

Weijie Wang retweeted

DailyPapers@HuggingPapers·5d

Microsoft just released World-R1: a framework that aligns text-to-video generation with 3D constraints through reinforcement learning, using feedback from pre-trained 3D foundation models to enforce structural coherence without altering the underlying architecture.

5.2K

Weijie Wang@wjwang2003·5d

@wjwang2003, Xiaoxuan He, Youping Gu, @Yif_Yang, @SteveZeyuZhang, Yefei He, Yanbo Ding, Xirui Hu, @donydchen, Zhiyuan He, Yuqing Yang, @supremeZhuang

624

Weijie Wang@wjwang2003·5d

📄 Paper: huggingface.co/papers/2604.24… 🏠 Project: aka.ms/world-r1 💻 Code: github.com/microsoft/Worl…

835

Weijie Wang@wjwang2003·5d

🔑 How it works:
• Embed camera trajectories into diffusion noise, zero extra modules
• 3D rewards from Depth Anything 3 + Qwen3-VL as geometry critics
• Periodic decoupled training: buildings stay rigid, flags still wave 🏗️🚩
• 3K text prompts only, no video data

1.4K

Zhenjun Zhao@zhenjun_zhao·Apr 16

Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective @wjwang2003, Qihang Cao, Sensen Gao, @donydchen, @haofeixu, @wenjing_bian, @songyoupeng, Tat-Jen Cham, @ChuanxiaZ, Andreas Geiger, Jianfei Cai, Jia-Wang Bian, @supremeZhuang tl;dr: new survey arxiv.org/abs/2604.14025

3.1K

Weijie Wang@wjwang2003·Apr 16

@zhenjun_zhao @donydchen @haofeixu @wenjing_bian @songyoupeng @ChuanxiaZ @supremeZhuang Thanks, @zhenjun_zhao! Check out our paper list and project page here: 📷 GitHub: github.com/ziplab/Awesome… 🌐 ff3d-survey.github.io

Weijie Wang@wjwang2003·Apr 16

Authors: @wjwang2003, Qihang Cao, Sensen Gao, @donydchen, @haofeixu, @wenjing_bian, @songyoupeng, Tat-Jen Cham, @ChuanxiaZ, Andreas Geiger, Jianfei Cai, @jiawangbian, @supremeZhuang

158

Weijie Wang@wjwang2003·Apr 16

📢 We release "Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective" — a comprehensive survey covering 200+ papers on feed-forward 3D reconstruction! Instead of categorizing by 3D representations, we propose a problem-driven taxonomy. 🌐 ff3d-survey.github.io

114

7.8K

Weijie Wang@wjwang2003·Apr 16

A joint effort by ZJU, NTU, Monash, ETH Zurich & Uni Tübingen. 📄 Paper: huggingface.co/papers/2604.14… 💻 GitHub: github.com/ziplab/Awesome… Feedback & discussions welcome! 🙌

188

Weijie Wang@wjwang2003·Apr 16

6+ application areas (AD, robotics, SLAM, video gen…)

160
