

Qingcheng Zeng
@SteveZeng7
PhD-ing with @rfpvjr and @kaize0409 / IR, search agent, LLM, social computing / Big fan of @Arsenal / Christian
About the results now: as the biggest fan of @orionweller on earth, I immediately thought about this when they released Promptriever (actually, I directly shot him a DM about it). Anytime people release a dataset on the Hub, it's a free opportunity to get SOTA by plugging it into PyLate.

So I went ahead and tried it (there is still a branch on the main repo for it, actually), and even coded an in-training evaluator running with a PLAID index and computing p-MRR! So why did I not release it? Well, I run these kinds of experiments on a daily basis and, as illustrated in the blog post, the results are good, but not insanely good. I was in a phase of beating B-scale models by a large margin, so being better on some metrics but not all was a good result, but not good enough for me to spend time digging into it.

Essentially, I was scared that prompting capabilities might require an LLM (or at least, starting from something that has already shown some prompting capabilities), so I was waiting to train some larger-scale ColBERT models before iterating on this (and I still believe we should see much better performance with those kinds of models; it's already pretty cool to hit these results with such small models!).

The work from @SteveZeng7 is a good reminder that sometimes sharing some cool results is enough, and we should aim to share as much as we can, not just the perfect shiny results. I should share more about all the exploration I do!

Finally, about the fact that it's better to start from GTE-ModernColBERT: I would say it's somewhat related to our ColBERT-Zero study (huggingface.co/blog/lightonai…), in the sense that it's important to be careful about training the model for late interaction and not take for granted that just taking a dense model as a basis is optimal. I am a bit surprised that the scale of this training is not enough, but as raised in the blog post, I suppose it's because it's more about learning "general" retrieval first!

Actually, it would be pretty cool to try this boilerplate with the ColBERT-Zero models; I wonder what the results would be! The main issue to me is that those models already leverage a "prompt" and it might conflict a bit, but it's an interesting avenue!
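For context on the "late interaction" discussed in the thread: ColBERT-style models score a query against a document by matching each query token embedding to its best-matching document token embedding (MaxSim) and summing over query tokens. A minimal pure-Python sketch with toy 2-d embeddings (an illustration of the scoring rule only, not PyLate's actual implementation):

```python
def maxsim_score(query_vecs, doc_vecs):
    """ColBERT-style late interaction: for each query token embedding,
    take its maximum dot product over all document token embeddings,
    then sum those maxima over the query tokens."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Toy per-token embeddings (hypothetical values for illustration).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.5, 0.5]]  # strong match for the first query token
doc_b = [[0.0, 1.0], [0.0, 0.9]]  # strong match for the second query token

print(maxsim_score(query, doc_a))  # → 1.5  (1.0 + 0.5)
print(maxsim_score(query, doc_b))  # → 1.0  (0.0 + 1.0)
```

In a PLAID-style index, candidate documents retrieved from the compressed index are ultimately re-ranked by this same MaxSim score.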


We built the first production-ready multi-vector and multimodal search. We now serve over 1 billion documents at under 50 ms latency (p50). We are sharing how we built it.



📢 New Preprint 📢 💪 Current LLMs perform quite well at pragmatic reasoning 🧐 But how do they acquire this ability? Introducing AltPrag, a dataset motivated by the idea of "alternatives" in pragmatics, to trace the training phase in which LLMs learn pragmatic reasoning. [1/n]






Echoing the great work from @dongkeun_yoon, we also share our updated preprint! 🧐 Do reasoning models verbalize their confidence better than instruct models? 🧐 Does RL provide additional benefits? 🧐 We explore this using a series of instruct and reasoning models... [1/n]





