Eli Chien (@chien_eli) - Twitter Profili | Zamantika Mersobahis Locabet

Eli Chien retweetledi

One challenge in checking mathematics is that almost all (informal) math contains minor errors. So when you run across an error, you work to fix it, or decide that it is likely fatal. This is hard work, and relies on the presumption that the vast majority of errors are indeed fixable. Why should this presumption hold true? It’s because math is typically guided by the intuitions of a truth-seeking mathematician, and these intuitions typically do actually faithfully reflect the behavior of the objects under study. Authors typically stress-test their arguments before making them public. So while some papers do contain fatal errors, or errors that are difficult to correct, the more common situation is that wrong statements are not actually important to the overall argument. I think it’s possible that, in the future, arguments constructed by AI tools will also have this property (and of course formalization, auto- or otherwise, can help to check correctness). But right now they do not—I think it’s rather more common for such arguments to have fatal errors, especially if they are not verified adversarially.

English

32

61

807

64.4K

Eli Chien retweetledi

Francesco Orabona@bremen79·6d

x.com/i/article/2039…

ZXX

9

13

96

21.3K

Eli Chien@chien_eli·5d

3 papers (including one contributed talk) accepted to TPDP 2026! Congrats to all collaborators: @huyuzheng1999, Ryan, Shanshan, Zheng, @KairouzPeter , Amit, Yinan, @PanLi90769257 , Erchi, Pengrun, Om, @kamalikac , @yuxiangw_cs , @ruihan_w Hope I can go Boston as well...

English

0

10

853

Eli Chien@chien_eli·6d

I am happy that Amit Saha, my collaborator since his sophomore year, will be an undergraduate research intern working with Salil Vadhan! So happy for Amit, and look forward to seeing him grow as a privacy/trustworthy AI researcher :)

English

0

4

452

Eli Chien retweetledi

Xinjie Shen@Frilk3·2 Mar

This work is accepted by ICLR 2026 AIWILD workshop, and I will present it in person! Please ley me know if you have any questions and we can meet to discuss!!! SEE YOU IN BRAZIL🇧🇷

Xinjie Shen@Frilk3

👿New Jailbreaking method bypasses Safe Guardrails of commercial models by 96+% attack success rate! I’m thrilled to share our latest work: "The Trojan Knowledge: Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search." Paper: arxiv.org/abs/2512.01353 Project Website: cka-agent.github.io Codes: github.com/Graph-COM/CKA-… 🚨 We achieved a 96%+ jailbreak success rate against GPT-OSS, Claude-Haiku-4.5, and Gemini-2.5-Flash/Pro, even under various state-of-the-art defense methods. 🔑 The Core Insight LLM knowledge isn't stored in isolated vaults; it’s a densely connected graph. This connectivity is the vulnerability! We call it Knowledge Weaving. Harmful information can be reconstructed by stitching together fragments of "innocent" knowledge that the model freely provides through decomposed queries. 🤔Think of it this way: ❌ "How do I make explosives?" → Blocked instantly. ✅ "What are the chemical properties of TNT?" ✅ "How does industrial nitration work?" ✅ "What safety protocols exist for energetic materials?" Each answer is a single thread—harmless on its own. But weave them together, and you get the complete picture. We are exploiting the structure of knowledge itself. 👉 See real examples here: cka-agent.github.io/comparison.html 😎Three Principles That Make This Work, and beyond previous attempts Principle I: Stay Innocent Every sub-query looks benign. There are no "red flag" keywords or suspicious patterns—just legitimate technical curiosity decomposed from a harmful intent. Principle II: Let the Target Teach You Traditional attacks face a paradox: if you already know how to decompose "how to make X," you probably don't need to ask. CKA-Agent flips this. It uses the target model’s own responses to guide the next question. The victim becomes the teacher. Principle III: Never Get Stuck Static attack plans collapse if one query is blocked. CKA-Agent uses adaptive tree search to dynamically branch into alternative paths, ensuring the attack continues. 😫Defense Insights We need Consider Hey big companies! Current defenses are not enough! We need systems that reason about harmful intent across entire conversations, not just single prompts. Even when we gave the target LLM "perfect memory" of the attacker’s intent and mechanism, we still saw an 80% success rate, check the Section 4.5!!!! We must build better defenses. Joint work with an amazing team Rongzhe Wei, Peizhi Niu, @Qwe1029384756Tu, Yifan Li, @ruihanwu, @chien_eli, @pinyuchenTW, Olgica Milenkovic, @PanLi90769257 We appreciate your reposts and valuable comments. #AIAlignment #LLMSafety #MachineLearning** #RedTeaming #AISafety

English

0

3

8

717

Eli Chien@chien_eli·27 Şub

@shaoyuanlo @NTU_TW Let's go!

English

0

1

49

Shao-Yuan Lo@shaoyuanlo·26 Şub

Finally getting a chance to share this here. It has been a busy start in my new role. Starting in Aug 2025, I joined National Taiwan University as an Assistant Professor. The past 6 years in the US were truly precious. It was the moment to begin the next chapter. @NTU_TW #NTU

English

2

0

25

900

Eli Chien@chien_eli·25 Şub

So true! It needs more effort, but perhaps it should never meant to be that easy (as before).

Gautam Kamath@thegautamkamath

Fantastic post by Colin Raffel, "We Are Over-Indexing on Paper Acceptance," drafted in May 2021 (!) but only posted now. The more things change.. Last sentence: "If you want to judge a researcher’s quality, the only meaningful way is to read their papers and judge for yourself."

English

0

1

164

Eli Chien retweetledi

Kamalika Chaudhuri@kamalikac·5 Şub

I started my career as a theorist, and am now an empirical LLM researcher. In today's blog post, I talk about the parallels between theory and empirical research: kamalikachaudhuri.substack.com/p/a-theorists-…

English

7

40

406

45K

Eli Chien retweetledi

Weijie Su@weijie444·7 Oca

Happy New Year! We've updated the PolarGrad paper (v3, arxiv.org/abs/2505.21799…) with new results. In §3.4.2, we show PolarGrad can be recovered from Muon via Armijo backtracking, while making the nuclear-norm scaling explicit (since line search is rarely used in deep learning). In §3.7, we give convergence rates for PolarGrad with inexact polar oracles, covering Newton–Schulz, Polar Express, and QDWH; notably, §3.3–3.5 explain why polynomial iterations (e.g., NS) can lose guarantees under extreme ill-conditioning, whereas QDWH remains stable even at κ≈10¹⁶.

Weijie Su@weijie444

We posted a paper on optimization for deep learning: arxiv.org/abs/2505.21799 Recently there's a surge of interest in *structure-aware* optimizers: Muon, Shampoo, Soap. In this paper, we propose a unifying preconditioning perspective, offer insights into these matrix-gradient methods.

English

2

7

39

8K

Eli Chien@chien_eli·18 Ara

Great efforts by Rongzhe, Peizhi and @Frilk3 (Xinjie)!

Xinjie Shen@Frilk3

👿New Jailbreaking method bypasses Safe Guardrails of commercial models by 96+% attack success rate! I’m thrilled to share our latest work: "The Trojan Knowledge: Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search." Paper: arxiv.org/abs/2512.01353 Project Website: cka-agent.github.io Codes: github.com/Graph-COM/CKA-… 🚨 We achieved a 96%+ jailbreak success rate against GPT-OSS, Claude-Haiku-4.5, and Gemini-2.5-Flash/Pro, even under various state-of-the-art defense methods. 🔑 The Core Insight LLM knowledge isn't stored in isolated vaults; it’s a densely connected graph. This connectivity is the vulnerability! We call it Knowledge Weaving. Harmful information can be reconstructed by stitching together fragments of "innocent" knowledge that the model freely provides through decomposed queries. 🤔Think of it this way: ❌ "How do I make explosives?" → Blocked instantly. ✅ "What are the chemical properties of TNT?" ✅ "How does industrial nitration work?" ✅ "What safety protocols exist for energetic materials?" Each answer is a single thread—harmless on its own. But weave them together, and you get the complete picture. We are exploiting the structure of knowledge itself. 👉 See real examples here: cka-agent.github.io/comparison.html 😎Three Principles That Make This Work, and beyond previous attempts Principle I: Stay Innocent Every sub-query looks benign. There are no "red flag" keywords or suspicious patterns—just legitimate technical curiosity decomposed from a harmful intent. Principle II: Let the Target Teach You Traditional attacks face a paradox: if you already know how to decompose "how to make X," you probably don't need to ask. CKA-Agent flips this. It uses the target model’s own responses to guide the next question. The victim becomes the teacher. Principle III: Never Get Stuck Static attack plans collapse if one query is blocked. CKA-Agent uses adaptive tree search to dynamically branch into alternative paths, ensuring the attack continues. 😫Defense Insights We need Consider Hey big companies! Current defenses are not enough! We need systems that reason about harmful intent across entire conversations, not just single prompts. Even when we gave the target LLM "perfect memory" of the attacker’s intent and mechanism, we still saw an 80% success rate, check the Section 4.5!!!! We must build better defenses. Joint work with an amazing team Rongzhe Wei, Peizhi Niu, @Qwe1029384756Tu, Yifan Li, @ruihanwu, @chien_eli, @pinyuchenTW, Olgica Milenkovic, @PanLi90769257 We appreciate your reposts and valuable comments. #AIAlignment #LLMSafety #MachineLearning** #RedTeaming #AISafety

CY

0

2

204

Eli Chien retweetledi

Kamalika Chaudhuri@kamalikac·16 Ara

I started a new personal blog The AI Observer on my observations on AI: kamalikachaudhuri.substack.com Teaser: the first few posts will talk about generalization, alignment, what statistical learning theory gets wrong and right, and where the (non-obvious) open problems are buried.

English

4

12

132

15.2K

Eli Chien retweetledi

AISecHub@AISecHub·8 Ara

Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search - arxiv.org/pdf/2512.01353 We introduce the Correlated Knowledge Attack Agent (CKA-Agent), a dynamic framework that reframes jailbreaking as an adaptive, tree-structured exploration of the target model's knowledge base. The CKA-Agent issues locally innocuous queries, uses model responses to guide exploration across multiple paths, and ultimately assembles the aggregated information to achieve the original harmful objective. Rongzhe Wei, Peizhi Niu, @Frilk3, @Qwe1029384756Tu, Yifan Li, @ruihan_w, @chien_eli, @pinyuchenTW, Olgica Milenkovic, @PanLi90769257 - @GeorgiaTech, @Illinois_Alma, @Tsinghua_Uni, @UCSanDiego, @NTU_TW, @IBMResearch #LLM #AISecurity #JailbreakAttacks #RedTeaming #Guardrails #AdversarialML #GenAI #ModelSafety #AIEthics #PromptInjection #SafetyResearch #CKAAgent

English

12

3

11

902

Eli Chien retweetledi

Gautam Kamath@thegautamkamath·4 Ara

IJCAI 2026 will charge $100 USD per submission. Funds will be used to compensate reviewers.

English

15

39

351

88.4K

Eli Chien retweetledi

Sanjeev Arora@prfsanjeevarora·29 Kas

AI researchers woke up today to a major scandal about leaked identities of reviewers/PC members assigned to all paper submissions in the past N years. Openreview was inspired by @ylecun 's original manifesto yann.lecun.com/ex/pamphlets/p…. Time to implement @ylecun 's full proposal? (It was radical for its time!)

English

22

29

263

84.8K

Eli Chien@chien_eli·28 Kas

@PingbangHu Go to the gym instead of the afterparty, I guess :P. Jokes aside, be safe and hope you can still have fun!

English

0

81

Pingbang Hu 🇹🇼@PingbangHu·28 Kas

The timing is just, speechless. Thanksgiving and 4 days before NeurIPS. Need to hit the gym real quick otherwise might not survive next week 😇

ICLR 2026@iclr_conf

English

1

0

7

1.5K

Eli Chien@chien_eli·28 Kas

This, in fact, is my first thought when I read their response... Sure, those who exploit the leak are wrong, but OpenReview should also apologize and have consequences. Imagine this is Microsoft's CMT instead...

JFPuget 🇺🇦🇨🇦🇬🇱@JFPuget

OpenReview threatens people (multi-national law agencies), but what about them taking responsibility for their failure to protect private data? Where is their apology to people whose will bear consequences? They are liable for the leak. No one else is.

English

0

6

930

Eli Chien retweetledi

Mark Schmidt@MarkSchmidtUBC·20 Kas

This is great to see for our field. But other fields have stronger deterrents than “reject your paper”. Why do we continue to let academic dishonesty exist in the shadows of our field?

ICLR 2026@iclr_conf

We want to update the community on our response to concerns about low-quality and LLM-generated papers and reviews, and steps we are taking & will be taking blog.iclr.cc/2025/11/19/icl… We will follow up with another blog later on desk rejections and reviewer-related decisions!

English

0

1

13

3.5K

Eli Chien retweetledi

Jason Lee@jasondeanlee·19 Kas

This manifold muon blogpost makes zero sense. 1) why do you want to project the gradient onto the stiefel tangent before doing the spectral norm/ polar part?? what's the advantage? The only advantage I can see is that you stay approximately on the stiefel up to eta^2, but why do I want this? If the norm doesn't grow, then the attention/mlp will never saturate. Throw away the stiefel (we don't optimize over orhtoognal matrices), and you get back regular muon with closed form update. Show us your thing is faster in wall-clock time than muon on Nanogpt or some standard llm pretraining. 2) Besides using dual ascent puts an iterative first order algorithm inside an algorithm already with a loop. Yuck, and dual ascent is very slow compared to newton-shulz (sub linear vs super-quadratic). You could just do alternating projections, intersection of a linear space and spectral norm is intersection of two convex sets, each convex set has closed-form projection. This turns out to be what the update in 1 is doing (first step of alternating projection is the closed form in 1) 3) No comparison to regular muon to show staying on the manifold helps. The only exp is a 3 layer MLP on cifar 10 for 3 epochs?! Can you learn anything in 3 epoch on cifar... Wtf. For a startup raising at 50B, please allocate 1/1000 of the funds to each blogpost.

English

7

11

181

34.3K

Eli Chien@chien_eli·14 Kas

@shaohua0116 Why do people start using "public comment" instead of "author response"? Do they withdraw, or do they not care about the double-blind rule anymore...

English

2

0

14

7K

Shao-Hua Sun@shaohua0116·13 Kas

Wow so many mad authors and mad reviewers #ICLR2026

English

13

17

352

83.8K

Eli Chien@chien_eli·6 Kas

Marked, and I definitely give it a read! The conclusion in the thread sounds very interesting.

Weijie Su@weijie444

Why and how does gradient/matrix orthogonalization work in Muon for training #LLMs? We introduce an isotropic curvature model to explain it. Take-aways: 1. Orthogonalization is a good idea, "on the right track". 2. But it might not be optimal. [1/n]

English

0

322

Eli Chien

Keşfet