
Leo Chenghui Li

@leo_chenghui
AI Research & Engineering @AIatMeta: World Models, Human AI, Multimodal LLMs | Ex-CMU Ex-CUHK



Introducing ABC: open data, training, and infrastructure for robotics. We release the largest teleop dataset to date, and extensively investigate design decisions, pretraining, and post-training techniques. @arthurallshire @Cinnabar233 @adamrasb @redstone_hong @davidrmcall



Over the past years, my research focus has been on unifying models and training paradigms across modalities. Today I'm excited that we're releasing our latest model aligned with this theme: Gemma 4 12B, a dense encoder-free model which processes raw text, image, and audio inputs! 1/


In-context learning in LLMs





People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. thinkingmachines.ai/blog/interacti…


The team has been sweeping at local trivia night thanks to a model that's aware of continuous time.

🎆 Beyond stoked to share some experiments of what we’ve been working on. It's been an absolute adventure building this (still early-stage) interaction model from the ground up with the team, rethinking many components to enable a model that **interacts natively**. It sees the world, talks over users, searches, and generates artifacts. It grasps the dimension of time. No more explicit “scaffolding”/turn-taking (think about having a conversation via email?). The future is live⚡️, and consider joining us on this journey if you are excited about this too!

Will be presenting Self Forcing during today’s NeurIPS poster session at 4:30pm. At Saturday's NextVid workshop, I’ll also be giving a talk on video world models, covering the challenges outlined in my blog post and sharing the latest research to address them. Looking forward to the discussions! xunhuang.me/blogs/world_mo…




