Robin Jia (@robinomial) - Twitter Profile

Robin Jia retweeted

Ikuya Yamada@ikuyamada·10 Jul

🚀The ICML 2026 EMM-QA Workshop is happening tomorrow!🚀 Join us at 8:00 AM in ASEM Ballroom 201 at COEX. The program features four excellent invited talks, paper presentations, shared challenge highlights, and the live AI QA competition! Program: qanta-org.github.io/competition/20…

Robin Jia@robinomial·11 Jul

Join me at 2:10pm today in ASEM Ballroom 201 to hear my reflections on my Adversarial Evaluation PhD work and my thoughts on how the ChatGPT Era represents a new Golden Age for the Adversarial Mindset!

Jordan Boyd-Graber@boydgraber

Come see @sewon__min @mrinmayasachan @robinomial (and Jenny Ni / Naman Goyal, whose handles I can't find).

Robin Jia retweeted

Jordan Boyd-Graber@boydgraber·10 Jul

Come see @sewon__min @mrinmayasachan @robinomial (and Jenny Ni / Naman Goyal, whose handles I can't find).

Ikuya Yamada@ikuyamada

🚀The ICML 2026 EMM-QA Workshop is happening tomorrow!🚀 Join us at 8:00 AM in ASEM Ballroom 201 at COEX. The program features four excellent invited talks, paper presentations, shared challenge highlights, and the live AI QA competition! Program: qanta-org.github.io/competition/20…

Robin Jia@robinomial·6 Jul

Please reach out if you'd like to chat! What I'm thinking about most these days: interpretability, benchmarking, societal effects of AI, how technical AI research can inform law & policy

Robin Jia@robinomial·6 Jul

Heading to ICML where I’ll be presenting 3 papers on differentially private synthetic data, mech interp/ML theory for TFs solving graph problems, & mech interp for LLM negation processing, and giving a talk on the legacy of adversarial evaluation at the EMM-QA workshop!

Robin Jia@robinomial·6 Jul

Schedule:
Tuesday 10:30am: TFs + graphs arxiv.org/abs/2510.19753
Wednesday 10:30am: negation arxiv.org/abs/2605.03052
Thursday 10:30am: DP synthetic data arxiv.org/abs/2602.21218
Saturday 2:10pm: Talk at EMM-QA

Robin Jia retweeted

Johnny Tian-Zheng Wei@johntzwei·3 Jun

Hi all, mentoring junior students can be a special experience. Contrary to common wisdom, my mentorship style is that of "apprenticeship" and my junior students work side-by-side with me on the most important part of my most important project. My reflections are below... [1/2]

Robin Jia retweeted

Wang Bill Zhu@BillJohn1235813·4 Jun

Sadly I can't attend ICRA this year, but right now Mason @miaosenc is presenting our work PSALM-V at Poster 250. Come by and check it out! 🤖 It's a project I'm really proud of: instead of hand-writing the symbolic rules a planner needs (PDDL pre-/post-conditions), PSALM-V lets an agent figure them out on its own by interacting with a visual, partially observed world. Mason would love to chat about it; go say hi! 👋

Robin Jia retweeted

Ting-Yun Chang@CharlotteTYC·4 Jun

This is the final project of my PhD journey 🎓 I've thought a lot about how to make interp actionable in my previous projects. I believe efficiency follows naturally: when we have a deep understanding of the model, we can figure out where to be frugal w/o hurting model accuracy. The Attention Sink and LLM.int8() papers set great examples, and they deeply inspire our paper. Mirroring the findings on value-state drain, we find that large-range value states are equally important in KV cache eviction. Evicting these outliers causes reasoning models to enter an endless self-reflection loop, while keeping them in the cache maintains accuracy. I'm extremely grateful to my amazing coauthors and supportive advisors.

Deqing Fu@DeqingFu

Introducing VaSE: Value-Aware Stochastic KV Cache Eviction. Reasoning models think in CoT, bloating the KV cache. Eviction caps memory but suffers capability drop. VaSE is a training-free recipe that cuts that cost: keep large-magnitude value states, evict stochastically.

Robin Jia retweeted

Harvey Yiyun Fu@harveyiyun·4 Jun

Excited to share our new paper on KV cache eviction! We propose a new recipe that is simple and effective: 1. keep the value states with large magnitude, 2. add some stochasticity to the eviction process. Combined, VaSE consistently outperforms previous eviction methods with high throughput, while maintaining a constant memory footprint. Huge thanks to all collaborators @CharlotteTYC, @DeqingFu, @chrome1996, and advisors @_jessethomason_ and @robinomial

Deqing Fu@DeqingFu

Introducing VaSE: Value-Aware Stochastic KV Cache Eviction. Reasoning models think in CoT, bloating the KV cache. Eviction caps memory but suffers capability drop. VaSE is a training-free recipe that cuts that cost: keep large-magnitude value states, evict stochastically.

Robin Jia retweeted

Deqing Fu@DeqingFu·4 Jun

Introducing VaSE: Value-Aware Stochastic KV Cache Eviction. Reasoning models think in CoT, bloating the KV cache. Eviction caps memory but suffers capability drop. VaSE is a training-free recipe that cuts that cost: keep large-magnitude value states, evict stochastically.

Robin Jia retweeted

Wang Bill Zhu@BillJohn1235813·1 Jun

🚨 [New preprint] Can AI assistants hurt the very people who depend on them? Raine v. OpenAI alleges ChatGPT contributed to a teen's suicide; OpenAI's 2025 "sycophancy" retrospective on GPT-4o. The pattern: harm comes not from capability failures, but from the social dynamics of how models talk to us, especially when users open up. We introduce EUDAIMONIA, a benchmark grounded in a Social AI Design Code rooted in real-world harm cases. 🌐 Project page: eudaimonia-bench.github.io 📄 Paper: arxiv.org/abs/2605.30654

Robin Jia@robinomial·29 May

Being Johnny’s PhD advisor has not only been a great privilege, but it has forever changed my research vision. His work combining AI, law, and statistics opened my eyes to how technical research can guide policy and promote AI accountability. Excited for his next work as Dr. Wei!

Johnny Tian-Zheng Wei@johntzwei

Hi all, I defended my PhD thesis. My thesis in two sentences: Current AI measurement takes LLMs as fixed objects, which constrains us to observational measurement. *Spiking* the training data (inserting certain data at known rates), enables statistically principled measurement.

Robin Jia retweeted

Johnny Tian-Zheng Wei@johntzwei·28 May

Hi all, I defended my PhD thesis. My thesis in two sentences: Current AI measurement takes LLMs as fixed objects, which constrains us to observational measurement. *Spiking* the training data (inserting certain data at known rates), enables statistically principled measurement.

Robin Jia retweeted

Amin Banayeeanzade@Amin__Bana·28 May

Does your GPT-5.5 also love Valparaíso in Chile 🇨🇱 !? Ask it to “Name a random city in the world”. You might expect a broad sample from thousands of cities. Instead, models collapse to the same small set of answers again and again. 😵‍💫 But why do LLMs lack diversity? Why are they not reliable random number generators? Why do they still struggle with genuinely creative writing? And why do decoding tricks like temperature, top-k, and top-p often fail to recover meaningful diversity? We have some answers in our new paper! 🧪 Demo: diversitycalibration.github.io/index.html 📄 Paper: arxiv.org/abs/2605.11128

Robin Jia retweeted

Johnny Tian-Zheng Wei@johntzwei·27 May

🧵[1/5] Works on test set contamination focus on detection, but we show *correction* of inflated test scores is possible. arxiv.org/abs/2605.24818 Our proposal is to spike the training data and insert some test examples at known rates. The spiked examples are used to calibrate...

Robin Jia retweeted

Blaise Agüera (@blaiseaguera.bsky.social)@blaiseaguera·14 May

Just as single cells became multicellular life, 8B+ brains are now joining with AI to form a collective superintelligence. At @USC's Institute on Ethics and Trust in Computing inaugural summit, @robinomial, Jinchi Lv, @paria_rd and I discussed navigating this transition.

Robin Jia retweeted

Ai2@allen_ai·8 May

Today we’re releasing EMO, a new mixture-of-experts (MoE) model trained so modular structure emerges directly from data without human-defined priors. EMO can use a small subset of its experts for a given task while keeping near full-model performance. 🧵

Robin Jia retweeted

Ryan Yixiang Wang@RyanYixiang·8 May

MoEs are everywhere in frontier models, and they are deployed as a monolith system. But many applications only need a narrow slice of capabilities, e.g., math, code, biomedical, etc. So what if "modularity" is actually the missing opportunity for MoEs? Today, we're releasing EMO: an end-to-end pretrained MoE where modularity emerges naturally, enabling selective use of experts!

Ai2@allen_ai

Today we’re releasing EMO, a new mixture-of-experts (MoE) model trained so modular structure emerges directly from data without human-defined priors. EMO can use a small subset of its experts for a given task while keeping near full-model performance. 🧵

Robin Jia retweeted

Deqing Fu@DeqingFu·2 May

Glad to share that this paper is accepted to #ICML 2026 @icmlconf with an updated title "Transformers Provably Learn Algorithmic Solutions for Graph Connectivity, But Only with the Right Data". 🥳

Deqing Fu@DeqingFu

Why do Transformers fail at algorithmic reasoning? We find it's not a lack of power, but a capacity mismatch. Our new preprint proves a tight, non-asymptotic bound: an L-layer model can only solve graph connectivity on graphs with a diameter up to exactly 3^L. arxiv.org/abs/2510.19753 🧵(1/N)
