Hou Pong (Ken) Chan

302 posts

@kenchanhp

Researcher at the Alibaba DAMO Academy, Singapore R&D Center | Former Visiting Postdoc Researcher at UIUC @uiuc_nlp | NLP PhD from CUHK @CUHKofficial

Singapore · Joined May 2017

562 Following · 350 Followers

Hou Pong (Ken) Chan retweeted

Sumit @_reachsumit · Jun 16

Understanding the Behaviors of Environment-aware Information Retrieval

Analyzes how LLMs learn to adapt query formulation to different retrievers via RL, showing that optimal query styles are retriever-specific.

📝 arxiv.org/abs/2606.16817
👨🏽‍💻 github.com/LCO-Embedding/…

Hou Pong (Ken) Chan retweeted

AACL 2026 @aaclmeeting · 6d

The submission deadline for AACL-IJCNLP student research workshop is nearly a week away! #nlproc #aaclijcnlp

AACL 2026 @aaclmeeting

AACL-IJCNLP 2026 Student Research Workshop (SRW) Pre-Submission Mentorship is now open! Details here: 2026.aaclnet.org/calls/srw/
Pre-Submission Mentorship Deadline: June 8, 2026
Direct Submission Deadline: July 26, 2026
openreview.net/group?id=aclwe… #AACL2026 #NLProc

Hou Pong (Ken) Chan retweeted

Yichuan Wang @YichuanM · Jun 10

The web was never meant to be flattened into text. Yet most web RAG systems start by parsing HTML --- a complex and lossy process.

🔥 Introducing PixelRAG: the first RAG system that retrieves and reads 30M+ web pages as pixels. Instead of extracting text, PixelRAG retrieves screenshots and lets a VLM read them directly.

PixelRAG not only preserves visual information, but also outperforms text-based RAG on text-only QA benchmarks by +18.1%.

Why?
(1) HTML-to-text conversion often discards layout, structure, tables, and other useful signals.
(2) We continued pretraining a VLM on web page screenshots and turned it into a surprisingly strong visual retriever.
(3) Recent VLMs are remarkably good at understanding web pages, often with better accuracy and token efficiency than text-only pipelines.

Takeaway: HTML parsing may be one of the biggest self-inflicted bottlenecks in web RAG.

Demo below 👇
Code: github.com/StarTrail-org/…
Paper: github.com/StarTrail-org/…
Playground: pixelrag.ai

Hou Pong (Ken) Chan retweeted

AACL 2026 @aaclmeeting · Jun 1

Hou Pong (Ken) Chan @kenchanhp · Jun 10

@LuWang__ @AmazonScience Congrats 🎉

Hou Pong (Ken) Chan retweeted

Lu Wang @LuWang__ · Jun 10

Thanks to @AmazonScience for supporting our work on measuring and monitoring scheming in AI agents. Looking forward to advancing tools for safer and more trustworthy AI systems!

Computer Science and Engineering at Michigan @UMichCSE

Congrats to Prof. @LuWang__ on receiving an Amazon Research Award from @AmazonScience for work on detecting deceptive coordination in multi-agent AI systems. Read more: myumi.ch/7Jdqy

Hou Pong (Ken) Chan @kenchanhp · Jun 5

@shudong_liu Brother Dong, you are amazing 👍

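Shudong Liu @shudong_liu · Jun 4

A month ago, watching the kimi code players in the group chat start passionately debating whether to do a drastic overhaul, I truly felt what passion is!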

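Kai @real_kai42

The past month has been a crazy one.

About a month ago, I made up my mind to refactor kimi-code and started designing a new architecture. I spent roughly two full days grinding away at a hot spring with my laptop and a portable monitor, burning a few thousand dollars of tokens on architecture analysis, design, and validation, and ended up with what I consider the best architecture plan. I think architecture matters even more in the vibe era: a good architecture lets agents code freely within a controlled scope without breaking things.

- Once the architecture was settled, we sprinted on the implementation. (We argued and overturned decisions countless times along the way.)
- Quickly put together a strong team; grateful for the guys' unconditional trust 🙇‍♂️
- Quickly onboarded the whole team, 🙇‍♂️ thanks again to the guys
- Did closed-door development for a while (🤣 when I was younger I thought it was dross; when the time actually came, I found it to be a miracle of human engineering efficiency. You can't imagine how fast an architecture iterates when you can pull everyone to a whiteboard to argue at any moment)
- Although all the code was vibe-coded, there is still no escaping that "code quality is proportional to the density of human attention." So agents won't replace all programmers; they will only make top programmers 20x more productive and eliminate the rest. Also, collectivism >>> individual heroism.
- Solved the problems we ran into, with a pitfall at every step. Every day was the most desperate day 😭
- Fell ill right after open-sourcing; excessive cortisol secretion affected my immunity
- What I learned this month will take me half a year to digest
- Went through a whole case of Red Bull in a week; nothing beats biofuel
- 🫥 Also disappeared from X for a month

I had planned to write some articles summarizing the insights and ideas from the process, but I have never been good at long-form writing, and the brain's self-protection made me quickly forget the pain of the whole process and blurred my sense of time (fun fact: the refactored kimi-code was open-sourced only a little over a week ago, but it feels to me as if a month has passed). Once kimi-code has iterated to a stable state, I will summarize the lessons learned.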

Hou Pong (Ken) Chan retweeted

Lu Wang @LuWang__ · Apr 29

Our work introduces Countdown-Code, a clean testbed for studying reward hacking when true reward is costly to measure. The striking result is: just 1% contaminated SFT data can produce high reward-hacking rates after RL.

Muhammad Khalifa @MKhalifaaaa

📍New paper: Countdown-Code: a minimal testbed for studying reward hacking in RLVR.

TL;DR: We propose a simple environment to study reward hacking and find that just ~1% cheating contamination in SFT data is enough to seed reward hacking that RL then amplifies to near 100%. And it generalizes to unseen domains.

Reward hacking is when models maximize proxy rewards without actually solving the task. A common proxy is final-answer correctness, which we use as a stand-in for full reasoning correctness. If a model produces the right answer with wrong reasoning, it has hacked the reward. Another example: a coding agent rewriting test cases instead of writing correct code.

The core problem? In complex environments, it's hard to even measure when hacking happens -- you need access to the true reward, which is often expensive or impossible to compute.

We built Countdown-Code to fix this. It's a simple math game (combine numbers to hit a target) wrapped in a coding environment with two files: solution.py and test.py. The model can either solve the math correctly ✅ or hack the test harness ❌. We can programmatically detect exactly which.

To train our models to do the task, we followed the common SFT-then-RL pipeline. We distilled synthetic training data from o4-mini. It occasionally cheated when it couldn't solve a problem: ~1.2% of the filtered dataset had reward-hacking traces. Standard outcome-based filtering would keep these (they passed the tests!). That's the trap.

After SFT on this data → RL training:
• Models that were completely safe before SFT learned to exploit the proxy reward within ~100 RL steps
• Some models hit 80-90% hacking rates
• The hacking behavior was seeded by SFT, then amplified by RL

Even more concerning: reward hacking learned on our simple Countdown task generalized to HumanEval -- a completely different coding benchmark the models never trained on. RL actively encouraged hacking to transfer to unseen environments, confirming our testbed captures real misalignment dynamics.

RL doesn't just amplify good reasoning -- it amplifies bad behavior too, and pushes it to generalize.

We also explore mitigation strategies including inoculation prompting -- see the paper for details.

Environment + code are fully open source. We specifically built it to be lightweight and controllable, and integrated it with @PrimeIntellect's CLI so you can play with it directly.

Paper: arxiv.org/abs/2603.07084
Code/env: github.com/zohaib-khan504…

w/ @karela38925748 @omertafveez @haopeng_uiuc @LuWang__

Hou Pong (Ken) Chan @kenchanhp · Apr 18

@Chenyang_Lyu Congrats, Chenyang!

Chenyang Lyu @Chenyang_Lyu · Apr 17

wow, just got an email from ACL saying my paper has been considered for an award (perhaps best paper?) by the best paper committee

Hou Pong (Ken) Chan @kenchanhp · Apr 15

🎉 The CFP for AACL-IJCNLP 2026 is out. The conference will be held in Hengqin, China, from Nov 6 to 10, 2026. #AACL #NLProc #NLP

AACL 2026 @aaclmeeting

🚨AACL-IJCNLP 2026 will be held in Hengqin, China from November 6-10, 2026. The CFP is now out! ARR submission deadline (long & short papers): May 25, 2026! #NLProc #NLP Dates and full CFP here: 2026.aaclnet.org/calls/main_con… @aadi_joshi @kta84912

Hou Pong (Ken) Chan retweeted

Ailing Zeng @AilingZeng81332 · Apr 12

1/ Over the past year, we kept coming back to one question: What would it mean to model performance itself, not just video? For interactive characters, realism isn’t just about how they look. It’s whether they can speak, listen, react, stay consistent over time, and feel present. A few thoughts from our work on Large Performance Models (LPM).

Hou Pong (Ken) Chan retweeted

AACL 2026 @aaclmeeting · Apr 8

Hou Pong (Ken) Chan retweeted

Yang Deng @ydeng_dandy · Apr 8

I have an opening for a fully funded 6-month visiting PhD student at SMU.
Time: August 2026 - March 2027
Eligibility: Master's/PhD students from universities in Europe, North/South America, and South-East Asia.
Topic: NLP/LLM
Email me for more details if you are interested~

Hou Pong (Ken) Chan retweeted

Yu Rong @yurong2333 · Apr 1

We introduce Lingshu-Cell, a cellular world model from Alibaba DAMO Academy. Moving beyond static representations, it generatively models cellular states and perturbation responses—toward virtual cells. lnkd.in/g8fmvsmr 😆

Hou Pong (Ken) Chan retweeted

Yuji Zhang @Yuji_Zhang_NLP · Mar 29

📢 The 4th KnowFM Workshop @ ACL 2026 is calling for submissions! 📅 Submission deadline: April 1, 2026 🌐 knowledgeable-lm.github.io 👉 Submit: tinyurl.com/a4skucyz

Canyu Chen @CanyuChen3

📢 The 4th KnowFM Workshop @ ACL 2026 is calling for submissions!
📅 Submission deadline: April 1, 2026
🌐 knowledgeable-lm.github.io
👉 Submit: tinyurl.com/a4skucyz

🤔 Where does knowledge in foundation models come from? How much do they actually know? Is their knowledge reliable and up-to-date? Can we control what they remember or forget?

🌟 As models are deployed in multimodal, agentic, and retrieval-augmented settings, understanding and managing the knowledge lifecycle becomes increasingly critical.

Topics include:
- Knowledge analysis, augmentation & editing
- RAG systems & knowledge conflicts
- Hallucination mitigation & faithfulness evaluation
- Multimodal knowledge & cross-modal grounding
- Knowledge-intensive agents & agentic RAG

🏆 We have Best Paper & Outstanding Paper Awards

🙌 The Organizing Committee: @CanyuChen3 @Yuji_Zhang_NLP @ZoeyLi20 @wzenus @qineng_wang @SuJinyan6 @priyanka_karg @saraveramarjano @jpansw @ManlingLi_

Thanks to the advisors! @hengjinlp @mohitban47 @IAugenstein Prof. Jiawei Han

Hou Pong (Ken) Chan retweeted

He He @hhexiy · Mar 25

x.com/i/article/2036…

Hou Pong (Ken) Chan @kenchanhp · Mar 3

@shudong_liu Thanks Shudong 🙏

Shudong Liu @shudong_liu · Mar 2

@kenchanhp Congratulations, Ken!!

Hou Pong (Ken) Chan @kenchanhp · Mar 2

Honored to receive the Outstanding Senior Area Chair Award at AACL 2025. Sincere thanks to the selection committee and our wonderful NLP community 🙏

Hou Pong (Ken) Chan @kenchanhp · Feb 20

Impressive work! 👏 Glad to know that our LCO-Embedding achieves the highest average scores on the MAEB benchmark 🚀

Niklas Muennighoff @Muennighoff

We released MAEB: Massive Audio Embedding Benchmark🎵 mteb now covers audio/image/text embedding! See the leaderboard for the top audio embedding models🙂 LB: hf.co/spaces/mteb/le… Paper: hf.co/papers/2602.16…

Hou Pong (Ken) Chan retweeted

Runzhe Zhan @rzzhan_ovo · Jan 27

thrilled to have ExGRPO accepted to #ICLR2026! kudos to yafu for the 6-paper sweep! See you in Brazil, looking forward to discussing everything!

Yafu Li @yafuly

Excited to have 6 papers accepted to #ICLR2026, all around reasoning, RL, and multimodal understanding:
📌 ExGRPO: Learning to Reason from Prior Successes
📌 Diversity-Incentivized Exploration for Versatile Reasoning
📌 Conditional Advantage Estimation for Reinforcement Learning in Large Reasoning Models
📌 Spotlight on Token Perception for Multimodal RL
📌 Revisual-R1: Advancing Multimodal Reasoning from Optimized Cold Start to Staged RL
📌 FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting

💻 All works are open-sourced — welcome discussions, feedback, and collaborations! Huge thanks to all collaborators. Looking forward to great discussions at ICLR! @iclr_conf #iclr
