
Keming Wu
@Keming_Charles
PhD student @Tsinghua_Uni. Focused on generative AI and VLMs. Author of EditReward and OpenMMReasoner.

For a decade, we've made models wider and deeper—but we've barely changed how layers *talk* to each other. Since ResNet's `x + F(x)` in 2015, the depth residual has been the only highway for inter-layer communication. It's time to upgrade the staircase. 🧵
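The `x + F(x)` rule the thread refers to can be written in a few lines. A minimal numpy sketch (the linear map `W` is just a toy stand-in for a real conv/MLP sub-layer):

```python
import numpy as np

def residual_block(x, F):
    # ResNet's depth residual: the block's output is its input
    # plus a learned transformation of it, x + F(x).
    # Gradients flow through the identity path unchanged, which is
    # why this has been the default inter-layer "highway" since 2015.
    return x + F(x)

# Toy sub-layer: a fixed linear map standing in for learned weights.
W = np.array([[0.5, 0.0],
              [0.0, 0.5]])
F = lambda x: W @ x

x = np.array([1.0, 2.0])
y = residual_block(x, F)  # -> [1.5, 3.0]
```

Note that the identity branch carries `x` forward untouched; only `F(x)` is learned, so even a badly initialized block starts out close to the identity function.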

Introducing TurboQuant: our new compression algorithm reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI
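The tweet doesn't describe TurboQuant's actual algorithm, but the general idea behind KV-cache compression can be illustrated with plain low-bit quantization. A minimal sketch (per-row symmetric 4-bit quantization, purely for intuition — not TurboQuant's method):

```python
import numpy as np

def quantize_int4(x):
    # Symmetric per-row quantization to the 4-bit range [-8, 7].
    # One fp scale is stored per row; entries become small integers.
    scale = np.abs(x).max(axis=-1, keepdims=True) / 7.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# Stand-in for cached keys: (tokens, head_dim)
keys = np.random.randn(16, 64).astype(np.float32)
q, s = quantize_int4(keys)
recon = dequantize(q, s)

# Storing 4 bits instead of 16-bit floats shrinks the cache ~4x
# before packing; round-to-nearest keeps the per-element error
# bounded by half a quantization step (scale / 2).
max_err = np.abs(keys - recon).max()
```

Real systems (and, per the tweet, TurboQuant) layer further tricks on top — channel reordering, outlier handling, tighter bit widths — to reach the claimed 6x+ without accuracy loss.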

Advanced Machine Intelligence (AMI) is building a new breed of AI systems that understand the world, have persistent memory, can reason and plan, and are controllable and safe.

We've raised a $1.03B (~€890M) round from global investors who believe in our vision of universally intelligent systems centered on world models. This round is co-led by Cathay Innovation, Greycroft, Hiro Capital, HV Capital, and Bezos Expeditions, along with other investors and angels across the world.

We are a growing team of researchers and builders, operating in Paris, New York, Montreal and Singapore from day one.

Read more: amilabs.xyz
AMI - Real world. Real intelligence.


🚀 Excited to share OpenMMReasoner: A complete open-source recipe for multimodal reasoning training!
📊 874K SFT + 74K RL data
🔬 Reproducible SFT pipeline
⚡ Advanced RL training (GSPO/GRPO/DAPO)
📈 +11.6% over Qwen2.5-VL-7B baseline
🧵 Thread below 👇

🔥 Introducing LongVT: Teaching Multimodal LLMs to "Actively Look Back" and understand long videos just like humans!
We tackle the "sparse evidence" & "hallucination" issues in long-video reasoning with an end-to-end agentic solution.
Paper: arxiv.org/abs/2511.20785
More in thread

Excited to announce that TIGER-Lab has 8 papers accepted to ICLR 2026. Congrats to all the students and co-authors!
