WeiChen
@wei_chen_ai

29 posts

PhD @SCUT1918, intern @RIKEN_AIP_EN. Probabilistic modeling & generative models, post-training, and their applications to trustworthy and safe AI.

Joined July 2024
183 Following · 48 Followers

Pinned Tweet
WeiChen
WeiChen@wei_chen_ai·
🎉 Our paper about Preference Optimization has been accepted to ICML 2026! We unify entangled & disentangled objectives via incentive–score decomposition, derive the Disentanglement Band for ideal training dynamics: suppress loser while preserving winner. #ICML2026
WeiChen tweet media
4
2
45
3.4K
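The tweet leans on the Disentanglement Band, whose formal definition isn't given in the thread. As a rough, hypothetical operationalization of "suppress loser while preserving winner", one could classify each training step by the sign of the per-response log-probability changes:

```python
def in_disentanglement_band(d_logp_winner, d_logp_loser, tol=0.0):
    """Check whether one training step shows the 'ideal' dynamic
    described in the tweet: the loser's log-probability drops while
    the winner's does not. (Hypothetical operationalization, not the
    paper's actual definition.)"""
    return d_logp_loser < -tol and d_logp_winner >= -tol

# A step that suppresses the loser and slightly lifts the winner:
print(in_disentanglement_band(0.02, -0.15))   # True
# A step where suppressing the loser also dragged the winner down:
print(in_disentanglement_band(-0.08, -0.30))  # False
```

Under this reading, out-of-band steps are exactly those where pushing the rejected response down also pulled the chosen response down.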
Yi (Joshua) Ren
Yi (Joshua) Ren@JoshuaRenyi·
@HuggingPapers Aha, very cool work. I believe the analysis framework from the paper below can also support the findings here: for semantically similar sub-sequences, updating one influences the confidence of the others more strongly. x.com/JoshuaRenyi/st…
Yi (Joshua) Ren@JoshuaRenyi

📢Curious why your LLM behaves strangely after long SFT or DPO? We offer a fresh perspective—consider doing a "force analysis" on your model’s behavior. Check out our #ICLR2025 Oral paper: Learning Dynamics of LLM Finetuning! (0/12)

1
0
1
154
DailyPapers
DailyPapers@HuggingPapers·
Fine-tuning increases hallucinations. New research shows SFT causes factual errors by interfering with pre-trained knowledge. The authors propose self-distillation to learn new facts without forgetting, plus selective parameter freezing to reduce hallucinations while preserving performance.
DailyPapers tweet media
4
35
165
9.3K
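The "selective parameter freezing" idea can be sketched in a few lines. The embed/head split and the plain SGD update below are illustrative assumptions, not the paper's actual recipe:

```python
def sgd_step(params, grads, frozen, lr=0.5):
    """One SGD step that skips every parameter group named in `frozen`.
    Freezing the parts of the network believed to store pre-trained
    facts (a hypothetical split here) limits interference while the
    rest of the model keeps learning."""
    return {
        name: p if name in frozen
        else [w - lr * g for w, g in zip(p, grads[name])]
        for name, p in params.items()
    }

params = {"embed": [1.0, 1.0, 1.0], "head": [1.0, 1.0, 1.0]}
grads  = {"embed": [0.5, 0.5, 0.5], "head": [0.5, 0.5, 0.5]}

updated = sgd_step(params, grads, frozen={"embed"})
print(updated["embed"])  # frozen:  [1.0, 1.0, 1.0]
print(updated["head"])   # updated: [0.75, 0.75, 0.75]
```

In a real framework the same effect is usually achieved by disabling gradients on the frozen tensors (e.g. `requires_grad=False` in PyTorch) rather than masking the update by hand.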
WeiChen
WeiChen@wei_chen_ai·
@HuggingPapers Interesting! Our work introduces the disentanglement band: a conceptual tool for analyzing how preference updates interfere with the winner vs. loser responses. It helps diagnose how suppressing the loser may harm the winner. Also inspired by @JoshuaRenyi. x.com/i/status/20504…
WeiChen@wei_chen_ai

🎉 Our paper about Preference Optimization has been accepted to ICML 2026! We unify entangled & disentangled objectives via incentive–score decomposition, derive the Disentanglement Band for ideal training dynamics: suppress loser while preserving winner. #ICML2026

0
0
0
51
WeiChen
WeiChen@wei_chen_ai·
@daniel_sc4 Interesting! I've also been working recently on token-level uncertainty for distinguishing easy from hard tasks. 🙋‍♂️
0
0
1
27
Daniel Scalena
Daniel Scalena@daniel_sc4·
You can easily save up to 65% of compute while improving performance on reasoning tasks 🤯 👀 Meet EAGer: We show that monitoring token-level uncertainty lets LLMs allocate compute dynamically - spending MORE on hard problems, LESS on easy ones. 🧵👇
Daniel Scalena tweet media
2
5
25
5.4K
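A minimal sketch of uncertainty-driven compute allocation in the spirit of EAGer. The entropy threshold and sample budgets here are invented for illustration; the actual method is more involved:

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def allocate_samples(step_probs, budget_easy=1, budget_hard=4, thresh=0.5):
    """Toy router: average the per-step token entropies and spend more
    samples on high-entropy (harder-looking) prompts, fewer on
    low-entropy (easier-looking) ones. Thresholds are made up."""
    avg_h = sum(token_entropy(p) for p in step_probs) / len(step_probs)
    return budget_hard if avg_h > thresh else budget_easy

peaked = [[0.97, 0.01, 0.01, 0.01]] * 3   # confident decoding
flat   = [[0.25, 0.25, 0.25, 0.25]] * 3   # uncertain decoding
print(allocate_samples(peaked))  # 1
print(allocate_samples(flat))    # 4
```

The savings claim in the tweet comes from this asymmetry: most prompts look "peaked", so most of the budget is never spent.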
Aakash
Aakash@shinyzenith72·
Our work has been accepted to the ICML 2026 main track! arxiv.org/abs/2511.02623 If you are interested in alignment, and specifically realignment, do check out our work on resource-constrained editing of LLM values! @murari_ai @debdeeplikesai
Aakash tweet media
3
7
14
1.8K
WeiChen
WeiChen@wei_chen_ai·
@HadyHaji This seems like an interesting view of preference optimization, and I've recently been working on a similar idea. Could you please share a link to the paper?
0
0
0
5
عەبدولهادی عەباس عەبدوڵا
Exciting News! 🚀 Our paper is accepted at ICML 2026! We are thrilled to share that our research, "TUR-DPO: Topology- and Uncertainty-Aware Direct Preference Optimization," has been accepted at the International Conference on Machine Learning (ICML).
عەبدولهادی عەباس عەبدوڵا tweet mediaعەبدولهادی عەباس عەبدوڵا tweet media
1
0
1
219
WeiChen retweeted
Molei Tao
Molei Tao@MoleiTaoMath·
Please consider submitting high-quality work to the ICML 2026 Workshop on Foundations of Deep Generative Models, and interact with the cool community in the summer of vibrant Seoul, South Korea! fdgm-workshop.github.io/FDGM_ICML2026/ Submit at openreview.net/group?id=ICML.… by 4/30.
Molei Tao tweet media
0
6
45
6.8K
Yoshitomo Matsubara
Yoshitomo Matsubara@yoshitomo_cs·
From my experience with #ICML2026 as an AC and an author, I want to suggest:
1. ACs rate reviews BEFORE review release
2. Make the ratings VISIBLE to authors
Authors can see signals of the AC's engagement and better prepare rebuttals. More work for ACs, but we can mitigate that.
Yoshitomo Matsubara@yoshitomo_cs

TIL the importance of diligent ACs in the peer-review process (again), unfortunately as an author #ICML2026. No response to our repeated appeals against a reviewer's policy/guideline violation. On top of that, the AC committed the same violation as the reviewer I flagged. What's next?

1
1
32
6.1K
WeiChen
WeiChen@wei_chen_ai·
@stjohn2007 Oh, I read this paper a week ago. A nice and impressive piece of work! (In my opinion, it is also related to the density-chasm problem in density ratio estimation.)
0
0
0
114
Masanari Oi
Masanari Oi@stjohn2007·
We propose Autoregressive Direct Preference Optimization (ADPO), a new formulation of DPO that explicitly incorporates autoregressive modeling. ADPO revisits the foundations of DPO and leads to a more principled objective. 📚️arxiv.org/pdf/2602.09533
Masanari Oi tweet media
Masanari Oi@stjohn2007

Two first-author papers accepted to #ICML2026 🇰🇷 ! - Human-like multi-image spatial reasoning in multimodal LLMs (@silviasetitech @sponddd @dai0NLP Prof. Inoue @chokkanorg) - Autoregressive direct preference optimization (Mahiro Ukai @MasahiroKaneko_ @chokkanorg Prof. Inoue)

1
10
74
12.1K
Akari Asai
Akari Asai@AkariAsai·
2 papers accepted to ICML as Spotlights (top 2.2%)🥳 - DR Tulu: RL w/ evolving rubrics for SOTA long-form deep research arxiv.org/abs/2511.19399 - Binary RAR: RL w/ binary rewards for the hallucination–capability trade-off arxiv.org/abs/2510.17733 Congrats to all collaborators!
Akari Asai tweet mediaAkari Asai tweet media
7
18
232
11.3K
WeiChen
WeiChen@wei_chen_ai·
Entangled: chosen & rejected rewards are coupled (e.g., DPO). Disentangled: they update independently (e.g., DIL). Our work unifies both.
0
1
3
190
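The entangled/disentangled distinction can be made concrete with score gradients. The DPO gradient below is standard; the "disentangled" loss is a generic illustration, not DIL's actual objective:

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def dpo_grads(s_w, s_l, beta=1.0):
    """Gradients of the entangled (DPO-style) loss
    L = -log sigmoid(beta * (s_w - s_l)) w.r.t. each score.
    Both flow through the single margin, so they are equal and
    opposite: pushing the loser down pulls on the winner too."""
    g = -beta * sigmoid(-beta * (s_w - s_l))
    return g, -g  # (dL/ds_w, dL/ds_l)

def disentangled_grads(s_w, s_l):
    """Gradients of a generic disentangled objective
    L = -log sigmoid(s_w) - log sigmoid(-s_l)  (illustrative only):
    each score gets its own independent term."""
    return -sigmoid(-s_w), sigmoid(s_l)

gw, gl = dpo_grads(1.0, 0.5)
print(gw + gl)  # 0.0 -> the two updates are perfectly coupled
```

The sum of the two DPO gradients is always zero, which is one precise sense in which the chosen and rejected rewards are "coupled"; in the disentangled case the two gradients move independently.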
Runxin Xu
Runxin Xu@pigjunebaba·
We keep striving to build things that bring long-term value to everyone. We hope you enjoy our latest model. Try it now on web, app, and API 🚀 "Not enticed by praise, not fearful of slander; follow the Way and keep yourself upright."
DeepSeek@deepseek_ai

🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.
🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.
🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice.
Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today!
📄 Tech Report: huggingface.co/deepseek-ai/De…
🤗 Open Weights: huggingface.co/collections/de…
1/n

9
20
352
18.7K
WeiChen
WeiChen@wei_chen_ai·
Score-based methods: theory says the path doesn't matter, but practice says it does. We found why — path variance — and learned the optimal interpolation path in closed form. No heuristics, just math. #ICLR #ICLR2026 #Rio
WeiChen tweet media
2
0
3
133
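For one concrete notion of "path variance": for a simple linear interpolant between two independent endpoints, the variance along the path has a closed form, which a quick simulation confirms. This is only an illustration of the quantity; the paper's learned optimal path is not reproduced here:

```python
import random

def path_variance(t, v0=1.0, v1=4.0, n=100_000, seed=0):
    """Empirical variance of the linear interpolant
    x_t = (1 - t) * x0 + t * x1 with independent x0 ~ N(0, v0) and
    x1 ~ N(0, v1). Closed form: (1 - t)**2 * v0 + t**2 * v1."""
    rng = random.Random(seed)
    xs = [(1 - t) * rng.gauss(0, v0 ** 0.5) + t * rng.gauss(0, v1 ** 0.5)
          for _ in range(n)]
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / n

t = 0.5
print(path_variance(t))                   # ≈ 1.25
print((1 - t) ** 2 * 1.0 + t ** 2 * 4.0)  # 1.25
```

Different interpolation schedules give different variance profiles along the path, which is the kind of quantity the tweet says can be optimized in closed form rather than chosen heuristically.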
WeiChen retweeted
Kimi.ai
Kimi.ai@Kimi_Moonshot·
Meet Kimi K2.6: Advancing Open-Source Coding
🔹Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python (86.7), Math Vision w/ python (93.2)
What's new:
🔹Long-horizon coding - 4,000+ tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization).
🔹Motion-rich frontend - Videos in hero sections, WebGL shaders, GSAP + Framer Motion, Three.js 3D.
🔹Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100+ files.
🔹Proactive Agents - the K2.6 model powers OpenClaw, Hermes Agent, etc. for 24/7 autonomous ops.
🔹Claw Groups (research preview) - bring your own agents; command your friends' agents, bots & humans in the loop.
K2.6 is now live on kimi.com in chat mode and agent mode. For production-grade coding, pair K2.6 with Kimi Code: kimi.com/code
🔗 API: platform.moonshot.ai
🔗 Tech blog: kimi.com/blog/kimi-k2-6
🔗 Weights & code: huggingface.co/moonshotai/Kim…
Kimi.ai tweet media
910
2.4K
18.1K
7.4M
Cunxiang Wang
Cunxiang Wang@CunxiangWang·
@sheriyuo My schoolmates and I got around 6,000-7,500 CNY a month during the PhD, while housing was almost free.
2
0
1
1.1K
Xiuyu Li
Xiuyu Li@sheriyuo·
We discussed PhD student stipends:
- China mainland: ~3,000 CNY
- HK: 20,000-30,000 HKD
- US: e.g. 4,000 USD at Princeton
Considering that HK's cost of living is much lower than that of the US, HK is really more suitable for PhD applications. My supervisor said:
- The advantages you mentioned are just that 1st-tier cities exploit the scissors gap to extract surplus value from bottom-tier laborers.
- It is simply that HK happens to be close to the mainland.
Do you believe that? 😂😂😂
23
1
73
27.2K