Runjia Li

33 posts

@RunjiaLi

DPhil in Computer Vision, University of Oxford || Clarendon Scholar @Oxford_VGG @OxfordTVG || Experience with @Snap @AIatMeta

Noxus · Joined October 2022
174 Following · 133 Followers
Runjia Li reposted
Xuanchi Ren @xuanchi13
We scaled up Lyra to generate explorable 3D worlds! 🚀 Introducing Lyra 2.0 — turning a single image into a 3D world you can walk through, look back, and even drop a robot into 🤖 Code and Model available today! 🌐 Website: research.nvidia.com/labs/sil/proje… (1/N)
Runjia Li reposted
Alexander Pondaven @alexpondaven
Introducing ActionParty: the first video world model that controls up to 7 players simultaneously on the same screen across 46 game environments. We tackle the action binding problem in video diffusion, ensuring each player's action is applied to the right subject. 🧵
Runjia Li reposted
Junlin Han @han_junlin
Excited to share our new work: “Learning to See Before Seeing”! 🧠➡️👀 We investigate an interesting phenomenon: how do LLMs, trained only on text, learn about the visual world? Project page: junlinhan.github.io/projects/lsbs/
Runjia Li reposted
Chuanxia Zheng @ChuanxiaZ
After two amazing years with @Oxford_VGG, I will be joining @NTUsg as a Nanyang Assistant Professor in Fall 2025! I’ll be leading the Physical Vision Group (physicalvision.github.io) — and we're hiring for next year!🚀 If you're passionate about vision or AI, get in touch!
Runjia Li reposted
AK @_akhaliq
VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory
Runjia Li reposted
Zeren Jiang @CodyJzr
🎁 We present Geo4D, a method that repurposes a video diffusion model for monocular 4D reconstruction.
Project page: geo4d.github.io
Code repo: github.com/jzr99/Geo4D
𝐌𝐚𝐢𝐧 𝐂𝐨𝐧𝐭𝐫𝐢𝐛𝐮𝐭𝐢𝐨𝐧𝐬:
✨ A novel framework, Geo4D, that reconstructs dynamic scenes on top of an off-the-shelf video generator.
✨ A multi-modal geometric representation that helps the video diffusion model learn consistent geometry during training.
✨ A lightweight multi-modal alignment that fuses partially redundant geometric modalities at test time for coherent and robust 4D reconstruction.
✨ SOTA performance on video depth estimation and comparable performance on camera pose estimation.
Thanks to all co-authors for their invaluable support and contributions: @ChuanxiaZ, Iro Laina, @dlarlus, Andrea Vedaldi
Runjia Li reposted
Jianyuan @jianyuan_wang
Introducing VGGT (CVPR'25), a feedforward Transformer that directly infers all key 3D attributes from one, a few, or hundreds of images, in seconds! No expensive optimization needed, yet it delivers SOTA results for:
✅ Camera Pose Estimation
✅ Multi-view Depth Estimation
✅ Dense Point Cloud Reconstruction
✅ Point Tracking
Project Page: vgg-t.github.io
Code & Weights: github.com/facebookresear…
Runjia Li reposted
Jensen Zhou @jensenzhoujh
Hi there, 🎉 We are thrilled to introduce Stable Virtual Camera, a generalist diffusion model designed to address the exciting challenge of Novel View Synthesis (NVS). With just one or a few images, it allows you to create a smooth trajectory video from any viewpoint you desire. We’re naming this model in tribute to the Virtual Camera cinematography technology. @StabilityAI
🏠 Project Page: stable-virtual-camera.github.io
📄 Paper: stable-virtual-camera.github.io/pdf/paper.pdf
📃 Blog: stability.ai/news/introduci…
💻 Code: github.com/Stability-AI/s…
🤗 Model Card: huggingface.co/stabilityai/st…
🚀 Gradio Demo: huggingface.co/spaces/stabili…
🎬 Video: youtube.com/channel/UCLLlV…
Runjia Li reposted
Kalyan R @kalyan_einstein
New work with @lars__schaaf, @WillLin1028, @wanggrun, and @philiptorr: We optimize neural networks to smoothly represent minimum energy paths and predict transition states for chemical reactions. Compared to the traditional approach, our method shows (i) improved resilience to the initial guess, (ii) easy adaptability to escape local minima, (iii) the ability to capture a complex path on its own, and (iv) potential to generalize to unseen systems. This offers a flexible alternative to discrete methods that could unlock building a universal reaction path predictor. Our paper should appeal to anyone interested in machine learning to advance computational chemistry. Preprint: arxiv.org/abs/2502.15843.
Runjia Li reposted
Sumeet Motwani @sumeetrm
Introducing MALT: Improving Reasoning with Multi-Agent LLM Training 🫡 We present a new multi-agent post-training method that uses credit-assigned synthetic data to improve the reasoning capabilities and self-correction rates of a generator, critic, and refinement model working together 🧵
Runjia Li reposted
Richard Blythman @richardblythman
MALT: Improving Reasoning with Multi-Agent LLM Training 🧠
🤖 Assigns specialized roles: Generator (ideas), Verifier (checks), Refiner (polishes).
🌱 Trains via trajectory-based synthetic data, improving reasoning at each role.
🔍 Introduces credit assignment for reasoning errors, enabling targeted learning.
🎯 Results on reasoning benchmarks:
+14.14% (MATH) for complex problem-solving.
+9.40% (CSQA) for commonsense reasoning.
+7.12% (GSM8K) for grade-school math.
🔑 Advances in reasoning:
Tackles reasoning bottlenecks by decomposing tasks into iterative steps.
Enables models to learn not just from success but also from failure.
Pushes collaborative problem-solving closer to human-like approaches.
📊 MALT’s structured multi-agent reasoning outshines single-agent setups, combining search, collaboration, and fine-tuning.
Paper: arxiv.org/abs/2412.01928
Runjia Li reposted
Foundation Models in the Wild @ ICLR 2025
🚀 Excited to announce the 2nd Workshop on Foundation Models in the Wild at @iclr_conf 2025 (April 27 or 28, Singapore).
‼️ We welcome submissions! Please consider submitting your work here: fm-wild-community.github.io (deadline: Feb 3, 2025, AOE)
💠 Topics:
🔸 In-the-wild Adaptation of FMs
🔸 Reasoning and planning abilities in real contexts
🔸 Reliability and responsibility of various FMs
🔸 Practical limitations of FMs in deployments
🔸 Benchmarking FMs in real-world settings
🌟 Speakers: @AnimaAnandkumar, @xinyun_chen_, @chelseabfinn, @tatsu_hashimoto, @rzshokri, @jietang, @tydsh, @vidal_rene
🧑‍🤝‍🧑 Organizers: @Xinyu2ML, @HuaxiuYaoML, @mohitban47, @BeidiChen, @han_junlin, @Pavel_Izmailov, @peterljq, @PangWeiKoh, @WeijiaShi2, @qu_wenjie, @philiptorr, @zhaoywang_CS, @SonglinYang4, @LukeZettlemoyer, @jiahengzhang96
✍️ Interested in joining our Program Committee to help review? Please fill out the nomination form (forms.gle/y2SWkfaegMgj15…)!
😃 Hope to see you in Singapore or virtually in April. Stay tuned for more info.