Kangfu Mei

225 posts

@KangfuM

Research Scientist at @GoogleDeepMind. I Omni. Veo. | Ph.D. @JohnsHopkins. | Tweets are my own.

Mountain View, California · Joined July 2022
407 Following · 899 Followers
Kangfu Mei retweeted
Ethan Mollick @emollick
I think people don't realize why Gemini Omni is different than other video AIs. It is fully multimodal, so it can edit video natively, too. I took the famous "train" movie from 1896 & made it a bullet train, LEGO, added a time traveler, a centipede, muppets... (see reflections?)
125
227
2.9K
262.2K
Vishal Patel @vishalm_patel
Big day for the VIU Lab at Johns Hopkins! 🎓 We had an amazing batch of 7 PhD students get hooded today. Huge congratulations to our newest PhDs — proud of all your hard work, perseverance, and accomplishments. Excited to see all the great things you will achieve in industry, research labs, and beyond! #PhD #JHU #AI #ComputerVision @HopkinsDSAI @HopkinsEngineer @JHUECE
1
4
42
1.7K
Kangfu Mei @KangfuM
@CeyuanY Congratulations Ceyuan! That’s great work!
0
0
2
768
Ceyuan Yang @CeyuanY
Introducing Omni, one unified model that can support any-to-any multimodal modeling, including multimodal understanding, image/video generation and editing, world modeling, and 3D reconstruction. All in one, adopting a standard mixture-of-experts architecture with only 3B activations.
9
27
221
30.7K
Kangfu Mei @KangfuM
@EvelynZ5699647 This is great work. Any plans to apply this to video generative models as an RL reward?
2
0
0
370
Physion Labs Official @Physion_Labs
🚀🚀🚀 We're excited to introduce Galileo 0 (lnkd.in/gJttjEr5), our first research preview of a world critic for AI-generated video, which already outperforms Qwen 3.5-Plus, Gemini 3.1 Pro, Pegasus 1.2, and GPT 5.4 on physical consistency reasoning 🚀🚀🚀
Galileo doesn't just score outputs. It diagnoses them, identifying what failed, when it failed, where it happened, and why it broke the rules of the world.
This is a step toward a new paradigm: generate → critique → refine → repeat, where models don't just produce worlds, but learn to keep them consistent over time.
What makes this milestone even more meaningful: We built Galileo 0, along with our datasets (including our public Physion-Eval benchmark), evaluation pipeline, and early pilots, with less than $200K total spend in 3 months.
No massive training clusters. No billion-dollar budgets. Just a small, relentless team, strong conviction, and yes, at one point, a five-day stretch of not showering to get this model out.
Because we believe reliability will become core infrastructure for world models. In a world where billions are being poured into generation, the missing piece isn't more pixels; it's better critics 😊
#PhysionLabs #Galileo0 #WorldModels
11
40
278
21.3K
Kangfu Mei retweeted
Logan Kilpatrick @OfficialLoganK
Video’s here to stay - introducing Veo 3.1 Lite, our most cost-efficient video generation model to date, and on April 7th we are also reducing the price for Veo 3.1 Fast :)
195
155
2.7K
288.2K
Kangfu Mei retweeted
Demis Hassabis @demishassabis
Ten years ago, AlphaGo’s legendary match in Seoul heralded the start of the modern era in AI. Its famous ‘Move 37’ signaled to us that AI techniques were ready to tackle real-world problems in areas like science - and ideas inspired by these methods are critical to building AGI
175
504
3.6K
715.6K
Jiaming Song @baaadas
Excited to introduce Uni-1, our new *unified* multimodal model that does both understanding and generation: lumalabs.ai/uni-1 TLDR: I think Uni-1 @LumaLabsAI is > GPT Image 1.5 in many cases, and toe-to-toe with Nano Banana Pro/2. (showcase below)
29
53
413
97.1K
Kangfu Mei @KangfuM
🧞‍♂️
1
0
1
217
Kangfu Mei @KangfuM
Happy to share that "Streaming Autoregressive Video Generation via Diagonal Distillation" has been accepted to #ICLR26! See you all in Brazil 🇧🇷
0
0
10
527
CuiMao @CuiMao
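Hi everyone, this is the fourth version of my fine-tuned model. It delivers an important news broadcast on the theme of Trump's speech at the Davos forum. Updates in this version:
1: Added a LoRA model of the young and beautiful reporter CuiMao
2: Added the CMTV opening titles and program packaging
3: Improved the TTS; the output of the announcer's emotion training is now more stable
The addition of the Comrade CuiMao model means CMTV has put the last piece of its puzzle in place. The next version will be the official release. Thank you all for your support.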
CuiMao @CuiMao

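Hi everyone, this is my third fine-tuned North Korean news anchor model. All of its metrics are now basically stable and it is usable. Updates in this version:
1: Added emotion control for the anchor's facial expressions
2: Fixed the South Korean ASR subtitle model failing to recognize and convert speech in a North Korean context
3: Fixed the anchor's hands constantly moving at random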

39
11
117
82.7K
Kangfu Mei @KangfuM
Everyone knows pretraining is powerful, but its potential in IR tasks beyond SFT or RL remains largely unexplored. In this work, we show that reusing pretraining (e.g., Flux) is easier than you think! 🧱 LEGO doesn't need pair-wise data and converts pretrained knowledge in an unsupervised (pseudo-supervised) way. Huge congrats to our intern @YuyangHu_666 for this great work! This is my final paper of 2025. Can't wait to show you what we have for an even more exciting 2026! 🚀
Peyman Milanfar @docmilanfar

LEGO: our post-training framework adapts diffusion models to unseen domains without paired ground truth. Using large-scale image generation models as reference, we synthesize "oracle" training pairs. LEGO creates sharp, high-quality results on real images where others fail. 1/5

1
2
4
4.4K
Kangfu Mei @KangfuM
Interested in Veo 3.1? 🔬🔬🔬 From the Veo team @GoogleDeepMind, Sarah and I are presenting a live demo and Q&A at NeurIPS this year! Catch us at the Google Booth on Tues, Dec 2, from 2:00 - 2:30 PM. #NeurIPS2025 #Google
0
0
5
1.5K
Kangfu Mei @KangfuM
@viddivj Can I slip into OpenAI's NeurIPS25 party? 🤡
0
0
4
458
Vidhi Jain @viddivj
I’m heading to NeurIPS in San Diego this year. Fun full-circle moment: back in 2022 I waited in the long lines and managed to slip into the OpenAI party with my grad school friends — and now I’m attending as part of OpenAI team!
16
4
427
39.7K