Xiaocong Yang
@xy51_uiuc
40 posts

PhD student @illinoisCS. Founder of AI Interpretability @ Illinois. Alumni @Tsinghua_uni

Urbana, IL · Joined May 2023
80 Following · 21 Followers

Pinned Tweet
Xiaocong Yang@xy51_uiuc·
The recordings & full slides from my lecture series this semester, AI Interpretability in the Era of LLMs: Architecture, Behavior and Beyond (CS 591 BAI, Spring 2026), are now online! 🔗 interpretability.web.illinois.edu/tutorial-mater…
The lectures present a unified view of AI interpretability, covering:
🏗️ Architectural anatomy in LLMs
⚙️ Computational mechanisms with Transformer Circuit Theory
🌀 Emergent behaviors in large models
🔄 Paradigm shift in interpretability research: from post-hoc interpretability to generative interpretability
Huge thanks to Prof. ChengXiang Zhai, Prof. Rhanor Gillette, Prof. John Hart, Prof. Gerald F DeJong, Prof. Rainer Engelken, and all the students who attended and made the discussions so engaging!
#AIInterpretability #Neurosymbolic #LLM #MachineLearning #UIUC
[image]
Xiaocong Yang@xy51_uiuc·
@supakjk Hi Joo-Kyung, are you still looking for interns for this position?
Joo-Kyung Kim@supakjk·
We are actively recruiting PhD research interns for 2026 at Amazon Alexa AI. We are particularly interested in candidates with experience and publication records in multi-turn/agentic reinforcement learning, or LLM with tool/skill/episodic memories. If you are interested in this opportunity, please email me at jookyk@amazon.com with your CV and a brief statement of research interests.
Joo-Kyung Kim@supakjk

We are recruiting PhD research interns for 2026. We focus on generative AI areas such as long-horizon reinforcement learning, LLM with tool/skill/episodic memories, LLMaaJ with non-verifiable rewards, and multi-modal agents, but not limited to these. The ideal candidates are current PhD students with 1st-author publications in top NLP/ML venues such as ACL, NAACL, EMNLP, NeurIPS, ICLR, and ICML. If you are interested in this opportunity, please email me (jookyk at amazon.com) with your CV. linkedin.com/jobs/view/4336…

Xiaocong Yang@xy51_uiuc·
@willccbb @PrimeIntellect Hey Will, it sounds like a fun project to explore! Xiaocong here — CS PhD student at UIUC & founder of AI Interpretability @ Illinois research. Look forward to exchanging ideas with you! 💡
will brown@willccbb·
hiring 1-2 more interns this summer for Applied Research @primeintellect
focus areas = agentic RL, data + evals, or forward-deployed
in-person in SF, relo support provided, US work auth required (sorry), intended for current students
DM me something sick you've been working on
Xiaocong Yang@xy51_uiuc·
@Kimi_Moonshot In general, I agree the problem of bandwidth allocation in residual streams is very important for LLMs, so it's nice to see progress on it!
Xiaocong Yang@xy51_uiuc·
Very interesting work, Kimi (the human 😋)! Also curious whether you tried token-wise adaptive aggregation instead of the static version? In an ongoing project of my team, we empirically find that a per-token dynamic residual stream helps with performance, though we're using a different architecture.
Kimi.ai@Kimi_Moonshot·
Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation.
Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.
🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.
🔗 Full report: github.com/MoonshotAI/Att…
[image]
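The announcement above describes the mechanism only at a high level. As an illustration of the core idea, input-dependent attention over preceding layers instead of a fixed residual sum, here is a minimal NumPy sketch. This is not Moonshot's actual implementation; the shapes and the `Wq`/`Wk` projections are assumptions made for the sketch:

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_residual(history, block_out, Wq, Wk):
    """Mix preceding layer outputs with learned, per-token attention weights.

    history:   (L, seq, d) — outputs of the L preceding layers.
    block_out: (seq, d)    — output of the current block.
    Instead of h = sum(history) + block_out (uniform accumulation),
    each token attends over depth and retrieves a weighted mix.
    """
    q = block_out @ Wq                                  # (seq, d)
    k = history @ Wk                                    # (L, seq, d)
    scores = np.einsum('sd,lsd->ls', q, k) / np.sqrt(q.shape[-1])
    w = softmax(scores, axis=0)                         # per-token weights over depth
    mixed = np.einsum('ls,lsd->sd', w, history)         # selective retrieval
    return mixed + block_out
```

In a real model the projections would be trained end-to-end, and (per the tweet's Block AttnRes) layers would be grouped into compressed blocks so the cross-layer attention stays cheap at large depth.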
Xiaocong Yang@xy51_uiuc·
Honored to be invited to speak at the @Citadel GQS PhD Colloquium this April! I’ll be introducing our research initiative AI Interpretability @ Illinois, where we’re pushing the frontier of mechanistic and generative interpretability, and building principled foundations for next-generation AI models. Excited to engage with the community and shape what trustworthy AI should look like.
Merge Labs@merge·
Hello world! 👋 We are Merge Labs – a research lab with the long-term mission of bridging biological and artificial intelligence to maximize human ability, agency and experience. Read more about it from our founding team + join us: merge.io
Xiaocong Yang@xy51_uiuc·
@PeterHndrsn Just applied and look forward to it! A CS PhD student @UofIllinois leading the AI Interpretability @ Illinois team; previously an Econ undergraduate @Tsinghua_Uni interested in mechanism design and political philosophy.
Peter Henderson@PeterHndrsn·
A few more days to apply to MATS to work with me this summer on alignment! I'm also hiring visiting summer fellows and/or part-time visiting fellows to help me drive a few projects that are less alignment and more RL related. Links below!
Séb Krier@sebkrier·
Today I learnt that in 2009, neuroscientists placed a dead Atlantic salmon into an fMRI scanner, scanned it, and that this apparently has implications for AI interpretability. 🐟

They showed the dead fish pictures of humans in social situations and "asked" the fish to determine the emotions of the people. When they ran their standard statistical software, the results showed "brain activity" in the fish that correlated with the emotions. Obviously, the fish was not thinking; the "activity" was just random noise. The point of the study was to show that if you don't correct for statistical noise and use rigorous controls, your tools will find patterns where none exist.

This paper claims that the same lesson should be applied in interpretability work: many researchers use various tools to explain what is happening inside a neural network (e.g. probes, SAEs, etc.). But some of these convincing-looking explanations can also be extracted when applied to randomly initialized and untrained AI models (the dead salmon equivalent): saliency maps remain plausible after weight randomization, sparse autoencoders find interpretable components in random transformers, etc.

The authors propose that we stop treating interpretability as "storytelling" and start treating it as statistical inference: doing null hypothesis testing, quantifying uncertainty more systematically, interpreting explanations as a simplified surrogate model, etc. Although they also acknowledge that finding some signal in random networks doesn't automatically invalidate finding stronger signals in trained ones.

I'm not an interpretability researcher myself but would be curious to hear takes! arxiv.org/abs/2512.18792
[image]
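The "treat interpretability as statistical inference" proposal above can be made concrete with a toy permutation test: score a probe on the real labels, then compare that score against the same probe run on shuffled labels (the null). The nearest-class-mean "probe" below is a deliberately simple stand-in of my own, not a method from the paper:

```python
import numpy as np

def probe_score(feats, labels):
    """Toy probe: classify each point by its nearest class mean, return accuracy."""
    mu0 = feats[labels == 0].mean(axis=0)
    mu1 = feats[labels == 1].mean(axis=0)
    pred = (np.linalg.norm(feats - mu1, axis=1)
            < np.linalg.norm(feats - mu0, axis=1)).astype(int)
    return (pred == labels).mean()

def permutation_pvalue(feats, labels, n_perm=200, seed=0):
    """Null-hypothesis test: is the probe's score better than the score the
    same probe achieves on label-shuffled data (i.e., on pure noise)?"""
    rng = np.random.default_rng(seed)
    observed = probe_score(feats, labels)
    null = np.array([probe_score(feats, rng.permutation(labels))
                     for _ in range(n_perm)])
    pvalue = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, pvalue
```

On random features (the "dead salmon" case) the p-value should be unremarkable, while on features that actually encode the labels it should be small; this is exactly the kind of control that catches a probe that only looks convincing.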
Xiaocong Yang@xy51_uiuc·
@peyrardMax As a PhD student doing XAI & former Econ undergraduate, I totally got what you meant, Sir! More generally, current AI progress has relied heavily on lightbulb ideas that usually "come from nowhere". We're still in the Tycho era; we're waiting for the Kepler and Newton.
Maxime Peyrard@peyrardMax·
Psychology, econometrics, and neuroscience have faced similar difficulties and reacted by adopting methodological reforms and rigorous statistical (causal) frameworks. It is now our turn to build the methodological guardrails turning XAI into a pragmatic science.
Maxime Peyrard@peyrardMax·
New paper: The Dead Salmons of XAI
Standard fMRI pipelines once detected predictive brain regions in a dead salmon! A striking warning about poor statistical methodology.
Now, XAI faces similar issues: many methods can yield plausible explanations even for randomized networks.
Xiaocong Yang@xy51_uiuc·
@maksym_andr @ELLISInst_Tue @coeff_giving Congrats! 🎉 Happy and relieved to see some nice people studying this important topic, after seeing too many who care only about models' instrumental capabilities. Can't do a PhD with you :( but I and our research initiative are definitely interested in chatting with you!
Maksym Andriushchenko@maksym_andr·
Big news! Very excited to build my group at @ELLISInst_Tue with the support from @coeff_giving. We are hiring PhD students and postdocs (details are on my website). Please apply if you are interested in AI safety and alignment!
ELLIS Institute Tübingen@ELLISInst_Tue

The ELLIS Institute is proud to announce that @coeff_giving is supporting our Principal Investigator @maksym_andr with a grant of $1,000,000 to fund his research on AI safety. Find out more on our website: institute-tue.ellis.eu/en/news/pi-mak…

Xiaocong Yang@xy51_uiuc·
“… see Claude’s internal activations helps to screen all traffic”, I’m excited to see @AnthropicAI (finally) applying their mech interpretability tools to models at deployment. Maybe it’s worth trying a neuro-symbolic architecture for better efficiency to do this in production.
Anthropic@AnthropicAI

New Anthropic Research: next generation Constitutional Classifiers to protect against jailbreaks. We used novel methods, including practical application of our interpretability work, to make jailbreak protection more effective—and less costly—than ever. anthropic.com/research/next-…

Xiaocong Yang@xy51_uiuc·
From generating better observational data points to summarizing abstract and transferable laws (ideally universal, to be honest) that guide future development. This is the basic scientific stance; otherwise we'll keep relying on occasional lightbulb moments.
Ziming Liu@ZimingLiu11

New year's read 📔 -- "Physics of AI Requires Mindset Shifts." I argue that "Physics of AI" research is hard due to the current publishing culture. But there is a simple solution -- curiosity-driven open research. kindxiaoming.github.io/blog/2025/phys…

Xiaocong Yang@xy51_uiuc·
xiaocong-yang.github.io/personal-websi… My second blog post is out! 🤠 I discussed the relationship between neuro-symbolic systems, information compression, alignment and codified laws. Any feedback is much appreciated! 🥳