Yanshu Li✈️ICML2026

169 posts

@karrsen0713

Incoming CS PhD @ UT | MSCS @ Brown | Multimodal LLMs & Agents

Providence, RI · Joined January 2025
304 Following · 365 Followers
Pinned Tweet
Yanshu Li✈️ICML2026@karrsen0713·
✨ Steering vectors are everywhere in today’s LLM field—but is subtracting activations really a “good” vector? We’d like to share two recent works where we rethink steering vectors / activation steering from first principles, and ask what actually makes steering generalizable and reliable 👇

📄 ICR: Towards Generalizable Implicit In-Context Learning with Attention Routing (arxiv.org/abs/2509.22854)
📄 SVF: Steering Vector Fields for Context-Aware Inference-Time Control in LLMs (arxiv.org/abs/2602.01654)
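For context, the baseline being rethought here can be sketched in a few lines. Below is a minimal NumPy sketch of classic difference-of-means activation steering, with synthetic arrays standing in for real hidden states; the function names, shapes, and values are illustrative assumptions, not from either paper.

```python
import numpy as np

def steering_vector(pos_acts, neg_acts):
    # Classic difference-of-means steering: mean activation over a set of
    # "positive" prompts minus the mean over "negative" prompts.
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def apply_steering(hidden, v, alpha=1.0):
    # Add the scaled steering vector to every token's hidden state.
    return hidden + alpha * v

rng = np.random.default_rng(0)
pos = rng.normal(1.0, 0.1, size=(8, 16))   # synthetic activations, positive prompts
neg = rng.normal(-1.0, 0.1, size=(8, 16))  # synthetic activations, negative prompts
v = steering_vector(pos, neg)

h = rng.normal(size=(4, 16))               # hidden states at inference time
steered = apply_steering(h, v, alpha=0.5)
```

In a real model the same addition would typically be applied inside a forward hook at one chosen layer; the question the two papers raise is whether this single fixed vector generalizes across contexts at all.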
Wujiang Xu@wujiang_ai·
Excited to join Meta MSL this summer as a Research Scientist Intern, working on multimodal agents! Before this, I’d been exploring LLM agents from several angles: prompting-based memory systems, agentic RL training for long-horizon tasks, and evaluating memory in long-horizon agent tasks. If you’re in the Bay Area and working on related research or cool agent projects, I’d be happy to connect and chat!
FJ Han@feijianghan·
Finished my time at UPenn🏆 Two years ago I hadn’t published a single conference paper and honestly had no idea what I was doing. Crazy how much can change in two years.
Kanishka Misra 🌊@kanishkamisra·
Feedback welcome! This is rather short due to constraints that are out of my hands, but happy to hear any thoughts/appreciations/comments/questions! osf.io/preprints/psya…
Kanishka Misra 🌊@kanishkamisra·
New opinion piece on the interface between research on concepts and categories in minds vs. in neural network LMs! I take the position that there is much to be learned from this interface (e.g., learning about concepts from language alone) and outline some directions for future work.
Yanshu Li✈️ICML2026 retweeted
Gabriele Berton@gabriberton·
Cool paper from Meta suggesting that future MLLMs will be Native Multimodal Models (NMMs), hence no more vision encoders. But I disagree. I actually think we'll go in the other direction (what? more encoders? yes! read on...). All you need to know about the future of MLLMs 🧵
Weiming Ren@wmren993

1/ 🚀 We’re excited to share Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation! Tuna-2 is a native unified multimodal model that supports visual understanding, text-to-image generation, and image editing directly from pixel embeddings. 🐟✨

📄 Paper: arxiv.org/abs/2604.24763
🌐 Project: tuna-ai.org/tuna-2
💻 Code: github.com/facebookresear…

Most unified multimodal models still rely on pretrained vision encoders, which add architectural complexity and can create representation mismatches between understanding and generation. Tuna-2 asks a simple question: Do we still need vision encoders? 👀

Our answer is No! Tuna-2 has a completely encoder-free architecture, where images are processed directly by a unified transformer together with text tokens. Take a glimpse at what our model can generate ↓ 🎨🖼️

Yanshu Li✈️ICML2026 retweeted
AAAI@RealAAAI·
We are thrilled to present a detailed report describing the system built for the AAAI-26 AI review pilot, the survey results, and a new benchmark that was created to assess the capabilities of the system. Read the full article: arxiv.org/pdf/2604.13940
Yanshu Li✈️ICML2026@karrsen0713·
AAAI already employs a two-stage review mechanism; however, during the first stage, reviewers are still required to provide their feedback as usual. Perhaps the next time you encounter such "bullshit," you could simply summarize your thoughts in a single sentence and submit that as your review.
Vlado Boza@bozavlado·
I think big AI conferences need a quick pre-review period. Basically, the goal of the reviewer would be just to say: a) Bullshit b) Not bullshit Right now, I am wasting a lot of my time writing "thoughtful" reviews for bullshit papers, which do not deserve more than a one-sentence review.
Yanshu Li✈️ICML2026@karrsen0713·
@sheriyuo What is the point of mass-producing papers? For PhD admissions via recommendation, I don't think things have reached the point of needing mass production, and the same goes for meeting graduation requirements (unless you have no confidence in your own work and want to run a large-scale lottery). That leaves grinding for talent-program internships and jobs, but even those increasingly value quality and fit. Is it really just about beating others on publication count, feeling like a failure unless you submit at least one paper to every conference?
Xiuyu Li@sheriyuo·
The AI/ML CCF-A mass-production "scorched-earth playbook" (焚决): 1. Join a big group with a paper pipeline, or follow a young AP with more ideas than they can use and read the work of their first one or two cohorts of students. 2. Remember: stake out ideas first; post to arXiv as early as you can. 3. Your targets are the three "lesser" venues (AAAI, ACL, MM); your resubmission cycle should be ACL -> EMNLP/MM -> AAAI -> ACL. If the rebuttal goes even slightly badly, resubmit immediately; at the limit you can go through four rebuttal cycles a year. The A-tier door is always your home. 4. Use connections to borrow GPUs; pick directions that are either new or uncrowded, avoid CV, and pick up co-first or second/third authorships with senior labmates. 5. Be thick-skinned and know how to promote yourself. 6. If any step above breaks down, you can submit to CHI. Depending on your constitution and resources, cranking out a run-of-the-mill CCF-A paper within 3 months is the bar for passing. This "playbook" is purely personal opinion, for reference only.
Kagurazaka Sora@MisakaMou

A junior student asked me about this. I'd say even a "scorched-earth playbook" (焚决) evolves step by step, doesn't it? How is CCF-A suddenly treated like cabbage? If you're talking top venues, I even have experience with transactions, but this CCF-A world feels like a completely different field. Finding a clued-in advisor in a year-plus, on a shorter submission cycle, still seems pretty hard to me. Is CCF-A really that easy to publish in? (whispering) (For a while I worked on something CS-adjacent and published in transactions, but I really don't understand CCF-A.)

Chandan Singh@csinva·
As ML conferences explode, one mitigation I'd like to see seriously considered is a cap on each author's yearly submissions (e.g. 3 total across NeurIPS, ICML, ICLR). Spamming low-quality submissions should have a real cost
Xin Eric Wang (hiring postdoc)
Rejections are normal in academia, even for top researchers. To protect your mental health, practice 精神胜利法 ("spiritual victory method") with a deliberate double standard: ✅ Accepted: celebrate. you deserve it! ❌ Rejected: reviews are noisy; just bad luck. Try again.
roife@roifex·
Seeing people start to call computer science a "dead-end major" (天坑专业) feels surreal.
Yanshu Li✈️ICML2026@karrsen0713·
@Annikeroseling If you can still revise in time for NeurIPS, submit anyway. For one thing, the gap in review quality between NeurIPS and AAAI isn't that big; for another, once NeurIPS scores come out you can still resubmit to AAAI.
anemos 🐰@Annikeroseling·
ECCV results are out: one paper has hope, one was killed outright. But with ECCV scores this low, NeurIPS and SIGGRAPH Asia are probably pointless too, so I can only wait for AAAI in August. God, why is there no conference to submit to in June and July?
Yanshu Li✈️ICML2026@karrsen0713·
Securing 5+ acceptances, with 0 to 1 of them as first author, is typically achieved through various collaborations (a trend that appears to be gaining momentum). Securing 2 to 3 first-authored papers usually means previously rejected submissions happened to be accepted together this cycle. As for 4 to 5 first-authored papers, that would be an incredible outlier (something I have yet to see).
Mathieu@miniapeur·
How do some PhD students manage to get 5+ ICML papers in the same year? I understand that collaboration plays a big role, but still. I’m pretty sure they aren’t submitting only to ICML either.
shourya⚠️@flaczip·
it's not my loss it's ICML's loss actually
Peng (Richard) Xia@richardxp888·
Package secured! 📦 Seriously though, who thought of this propeller hat? It’s like something straight out of a kindergarten playground. 😂 Ready to start my journey as a Student Researcher with @GoogleStudents. See you all soon! #GoogleInterns
Yanshu Li✈️ICML2026@karrsen0713·
Accepted at ICML 2026! Take Action for a Better RLVR!
MikaStars★@MikaStars39

Stop using LoRA for RLVR!!! New paper released 👉 Evaluating Parameter Efficient Methods for RLVR

📖 Alphaxiv: alphaxiv.org/abs/2512.23165
💻 Github: github.com/MikaStars39/Pe…

Is standard LoRA truly the optimal choice for reinforcement learning? We present the first large-scale evaluation of over 12 PEFT methodologies using the DeepSeek-R1-Distill family on complex mathematical reasoning benchmarks.

Key finding: standard LoRA is suboptimal. Structural variants such as DoRA, AdaLoRA, and MiSS consistently outperform standard LoRA. Notably, DoRA (46.6% avg. accuracy) even surpasses full-parameter fine-tuning (44.9%) across multiple benchmarks.

The failure of SVD-based initialization: strategies like PiSSA and MiLoRA experience significant performance degradation or total training collapse. This is due to a fundamental "spectral misalignment": these methods force updates on principal components, while RLVR intrinsically operates in the off-principal regime.

The expressivity floor: while RLVR can tolerate moderate parameter reduction, extreme compression (e.g., VeRA, IA³, or rank-1 adapters) creates an information bottleneck. Reasoning tasks require a minimum threshold of trainable capacity to successfully reorient policy circuits.

Recommendations for the community:
a. Move beyond the default adoption of standard LoRA.
b. Prioritize geometry-aware adapters like DoRA that decouple magnitude and direction.
c. Avoid SVD-informed initializations for RL tasks.
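The magnitude-direction decoupling that DoRA-style adapters use can be sketched roughly as follows. This is a minimal NumPy illustration under my own simplifications, not the paper's implementation; the shapes, initializations, and function name are assumptions made for the example.

```python
import numpy as np

def dora_weight(W0, A, B, m):
    # DoRA-style reparameterization: the LoRA-adapted weight is normalized
    # column-wise to a pure direction, then rescaled by a separately
    # learned per-column magnitude m.
    W = W0 + B @ A                                        # low-rank update, as in LoRA
    direction = W / np.linalg.norm(W, axis=0, keepdims=True)
    return m * direction                                  # magnitude decoupled from direction

rng = np.random.default_rng(1)
d_out, d_in, r = 6, 4, 2
W0 = rng.normal(size=(d_out, d_in))                       # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01                     # LoRA down-projection
B = np.zeros((d_out, r))                                  # LoRA up-projection, zero-init
m = np.linalg.norm(W0, axis=0, keepdims=True)             # magnitude initialized from W0

W_eff = dora_weight(W0, A, B, m)
```

With the zero-initialized up-projection, the effective weight reproduces W0 exactly at the start of training; only A, B, and m would then be updated, letting magnitude and direction move independently.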

paperpaper@paperpaper886·
@sheriyuo I've seen work that changed nothing, resubmitted for a whole year, and finally got into AAAI. Feels like this conference is done for.
Xiuyu Li@sheriyuo·
For AI PhDs aiming for industry, paper count matters, but only up to a point. In China, 2 to 3 (co-)first-author CCF-A papers is often the borderline for a Top Talent offer. Beyond that, the marginal gain drops fast. When you apply as a fresh grad, what matters more is whether you have matched experience in a big-tech foundation-model team. As a PhD, papers can feel like a huge part of the world. After graduation, people see it differently. And for CS PhDs, AI and LLMs are only a small slice. Many groups do not even send students to industry internships the way LLM teams do, and industry itself is much bigger than LLMs. Papers are only one part of you. Your experience matters more. The LLM boom is a winner-takes-all arena shaped by extreme competition, where only the hardest-driving survivors make it to the top. (LLM is truly a survivors' golden age, ground out of brutal competition.)
ICML Conference@icmlconf·
So who's gonna set up the Polymarket for when ICML decisions are gonna drop? 👀📈⏳📉