Runjia Li

33 posts

@RunjiaLi

DPhil in Computer Vision, University of Oxford || Clarendon Scholar @Oxford_VGG @OxfordTVG || Experience with @Snap @AIatMeta

Noxus · Joined October 2022
174 Following · 133 Followers
Runjia Li reposted
Xuanchi Ren @xuanchi13
We scaled up Lyra to generate explorable 3D worlds! 🚀 Introducing Lyra 2.0 — turning a single image into a 3D world you can walk through, look back, and even drop a robot into 🤖 Code and Model available today! 🌐 Website: research.nvidia.com/labs/sil/proje… (1/N)
Runjia Li reposted
Alexander Pondaven @alexpondaven
Introducing ActionParty: the first video world model that controls up to 7 players simultaneously on the same screen across 46 game environments. We tackle the action binding problem in video diffusion, ensuring each player's action is applied to the right subject. 🧵
Runjia Li reposted
Junlin Han @han_junlin
Excited to share our new work: “Learning to See Before Seeing”! 🧠➡️👀 We investigate an interesting phenomenon: how do LLMs, trained only on text, learn about the visual world? Project page: junlinhan.github.io/projects/lsbs/
Runjia Li reposted
Chuanxia Zheng @ChuanxiaZ
After two amazing years with @Oxford_VGG, I will be joining @NTUsg as a Nanyang Assistant Professor in Fall 2025! I’ll be leading the Physical Vision Group (physicalvision.github.io) — and we're hiring for next year!🚀 If you're passionate about vision or AI, get in touch!
Runjia Li reposted
AK @_akhaliq
VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory
Runjia Li reposted
Zeren Jiang @CodyJzr
🎁 We present Geo4D, a method that repurposes a video diffusion model for monocular 4D reconstruction.
Project page: geo4d.github.io
Code repo: github.com/jzr99/Geo4D
𝐌𝐚𝐢𝐧 𝐂𝐨𝐧𝐭𝐫𝐢𝐛𝐮𝐭𝐢𝐨𝐧𝐬:
✨ A novel framework, Geo4D, that reconstructs dynamic scenes on top of an off-the-shelf video generator.
✨ A multi-modal geometric representation that helps the video diffusion model learn consistent geometry during training.
✨ A lightweight multi-modal alignment that fuses partially redundant geometric modalities at test time for coherent and robust 4D reconstruction.
✨ SOTA performance on video depth estimation and comparable performance on camera pose estimation.
Thanks to all co-authors for their invaluable support and contributions: @ChuanxiaZ, Iro Laina, @dlarlus, Andrea Vedaldi
Runjia Li reposted
Jianyuan @jianyuan_wang
Introducing VGGT (CVPR'25), a feedforward Transformer that directly infers all key 3D attributes from one, a few, or hundreds of images, in seconds! No expensive optimization needed, yet it delivers SOTA results for:
✅ Camera Pose Estimation
✅ Multi-view Depth Estimation
✅ Dense Point Cloud Reconstruction
✅ Point Tracking
Project Page: vgg-t.github.io
Code & Weights: github.com/facebookresear…
Runjia Li reposted
Jensen Zhou @jensenzhoujh
Hi there, 🎉 We are thrilled to introduce Stable Virtual Camera, a generalist diffusion model designed to address the exciting challenge of Novel View Synthesis (NVS). With just one or a few images, it allows you to create a smooth trajectory video from any viewpoint you desire. We’re naming this model in tribute to the Virtual Camera cinematography technology. @StabilityAI
🏠 Project Page: stable-virtual-camera.github.io
📄 Paper: stable-virtual-camera.github.io/pdf/paper.pdf
📃 Blog: stability.ai/news/introduci…
💻 Code: github.com/Stability-AI/s…
🤗 Model Card: huggingface.co/stabilityai/st…
🚀 Gradio Demo: huggingface.co/spaces/stabili…
🎬 Video: youtube.com/channel/UCLLlV…
Runjia Li reposted
Kalyan R @kalyan_einstein
New work with @lars__schaaf, @WillLin1028, @wanggrun, and @philiptorr: We optimize neural networks to smoothly represent minimum energy paths and predict transition states for chemical reactions. Compared to the traditional approach, our method shows (i) improved resilience to the initial guess, (ii) easy adaptability to escape local minima, (iii) the ability to capture a complex path on its own, and (iv) potential to generalize to unseen systems. This offers a flexible alternative to discrete methods that could unlock building a universal reaction path predictor. Our paper should appeal to anyone interested in machine learning to advance computational chemistry. Preprint: arxiv.org/abs/2502.15843.
Runjia Li reposted
Sumeet Motwani @sumeetrm
Introducing MALT: Improving Reasoning with Multi-Agent LLM Training 🫡 We present a new multi-agent post-training method that uses credit-assigned synthetic data to improve the reasoning capabilities and self-correction rates of a generator, critic, and refinement model working together 🧵
Runjia Li reposted
Richard Blythman @richardblythman
MALT: Improving Reasoning with Multi-Agent LLM Training 🧠
🤖 Assigns specialized roles: Generator (ideas), Verifier (checks), Refiner (polishes).
🌱 Trains via trajectory-based synthetic data, improving reasoning at each role.
🔍 Introduces credit assignment for reasoning errors, enabling targeted learning.
🎯 Results on reasoning benchmarks:
+14.14% (MATH) for complex problem-solving.
+9.40% (CSQA) for commonsense reasoning.
+7.12% (GSM8K) for grade-school math.
🔑 Advances in reasoning:
Tackles reasoning bottlenecks by decomposing tasks into iterative steps.
Enables models to learn not just from success but also from failure.
Pushes collaborative problem-solving closer to human-like approaches.
📊 MALT’s structured multi-agent reasoning outshines single-agent setups, combining search, collaboration, and fine-tuning.
Paper: arxiv.org/abs/2412.01928
Runjia Li reposted
Foundation Models in the Wild @ ICLR 2025
🚀 Excited to announce the 2nd Workshop on Foundation Models in the Wild at @iclr_conf 2025 (April 27 or 28, Singapore).
‼️ We welcome submissions! Please consider submitting your work here: fm-wild-community.github.io (deadline: Feb 3, 2025, AOE)
💠 Topics:
🔸 In-the-wild Adaptation of FMs
🔸 Reasoning and planning abilities in real contexts
🔸 Reliability and responsibility of various FMs
🔸 Practical limitations of FMs in deployments
🔸 Benchmarking FMs in real-world settings
🌟 Speakers: @AnimaAnandkumar, @xinyun_chen_, @chelseabfinn, @tatsu_hashimoto, @rzshokri, @jietang, @tydsh, @vidal_rene
🧑‍🤝‍🧑 Organizers: @Xinyu2ML, @HuaxiuYaoML, @mohitban47, @BeidiChen, @han_junlin, @Pavel_Izmailov, @peterljq, @PangWeiKoh, @WeijiaShi2, @qu_wenjie, @philiptorr, @zhaoywang_CS, @SonglinYang4, @LukeZettlemoyer, @jiahengzhang96
✍️ Interested in joining our Program Committee to help review? Please fill out the nomination form (forms.gle/y2SWkfaegMgj15…)!
😃 Hope to see you in Singapore or virtually in April. Stay tuned for more info.