Chandan Singh

403 posts

@csinva

Seeking superhuman explanations. Senior researcher @MSFTResearch, PhD from @Berkeley_AI

Seattle, WA. Joined February 2018

589 Following, 1.3K Followers

Pinned Tweet

Chandan Singh@csinva·Oct 3

Science faces an explainability crisis: ML models can predict many natural phenomena but can't explain them. We tackle this issue in language neuroscience by using LLMs to generate *and validate* explanations with targeted follow-up experiments. arxiv.org/abs/2410.00812 1/2

Chandan Singh@csinva·17h

PSA for American citizens going to ICLR: you need to get an e-visa for Rio (takes up to 10 business days)

Chandan Singh retweeted

Aruna S@arunasank·6d

Interpretability methods usually study single-token behavior. But real model behaviors, like sycophancy or writing style, are diffuse across many tokens. Can these diffuse behaviors be localized and controlled from long-form responses? YES!

Chandan Singh retweeted

abakalova@abakalova13175·Mar 4

Can we rewrite Transformers as human-readable code? In this paper, we decompile Transformers trained on algorithmic and formal language tasks into D-RASP, a programming language that mirrors the Transformer architecture. 🧵

Chandan Singh retweeted

Alexander Doria@Dorialexander·Feb 25

so apparently you can just ask models to write weights now.

N8 Programs@N8Programs

Beat it by having Codex hand-craft weights: gist.github.com/N8python/02e41… 100% accuracy on 10 million random test cases w/ only 343 parameters. As a bonus, it uses the vanilla Qwen3 architecture, just with the right weights.

Chandan Singh@csinva·Feb 25

Realizing that my brain now treats seeing a typo in a paragraph as a positive signal for quality

Chandan Singh retweeted

Weiyang Liu@Besteuler·Feb 6

Orthogonal Finetuning (oft.wyliu.com; boft.wyliu.com) has the unique advantage of preventing catastrophic forgetting. Inspired by this property, we find that merging models within the orthogonal group can effectively reduce model conflicts and preserve both pretraining and downstream knowledge. This is our OrthoMerge framework. The idea behind OrthoMerge is extremely simple. For OFT-tuned models, we first map the orthogonal adapters to the Lie algebra with the inverse Cayley transform and then perform merging there. This guarantees that the merged model differs from the pretrained model only up to an orthogonal transformation. Even better, OrthoMerge can also be applied to non-OFT-tuned models. By solving the orthogonal Procrustes problem, we obtain the adapter's projection onto the orthogonal group. OrthoMerge is then applied to that projected component, and the residual component can be merged using conventional merging methods. In other words, OrthoMerge can be used together with existing model merging methods! This is a great example of a simple yet effective idea. Great efforts by my PhD students Sihan Yang and Kexuan Shi. The project is already open source, so feel free to give it a try! Project: spherelab.ai/OrthoMerge/ Paper: arxiv.org/pdf/2602.05943 Code: github.com/Sphere-AI-Lab/…

Chandan Singh@csinva·Feb 6

Really nice post! A plausible and well-articulated vision for mech. interp that really improves serious model training

Tom McGrath@banburismus_

We’re putting more computation (in the form of intelligence) into the most general object in neural network training: backprop. This essay describes how I think we can do this, why interp is key, the relevance to alignment, and how we should do it right.

Chandan Singh retweeted

Yufan Zhuang@yufan_zhuang·Feb 4

Can LLMs self-improve without ground-truth rewards? 🔄 Introducing Test-time Recursive Thinking (TRT): models recursively refine rollout strategies and accumulate knowledge from their own attempts. 🚀 Results: 1. 100% accuracy on AIME-25 2. 10.4-14.8 pp improvement on LCB Hard

Chandan Singh@csinva·Jan 29

@ChenhaoTan This repo has been tracking ICML (and other confs) well: github.com/lixin4ever/Con…

Chenhao Tan@ChenhaoTan·Jan 26

where can I easily find the number of submissions to ICML over time, especially historical ones? papercopilot.com/statistics/icm… This page missed even 2013, 2014, and 2016.

Chandan Singh retweeted

Žiga Avsec@Avsecz·Jan 28

AlphaGenome is out in @nature today along with model weights! 🧬 📄 Paper: nature.com/articles/s4158… 💻 Weights: github.com/google-deepmin… Getting here wasn’t a straight path. We sat down @googledeepmind to discuss the story behind the model, paper & API: youtu.be/V8lhUqKqzUc

Chandan Singh retweeted

Goodfire@GoodfireAI·Jan 28

We've identified a novel class of biomarkers for Alzheimer's detection - using interpretability - with @PrimaMente. How we did it, and how interpretability can power scientific discovery in the age of digital biology: (1/6)

Chandan Singh retweeted

David Bau@davidbau·Jan 23

Generated CoT is a fascinating window into modern LMs, but are these internal monologues as readable as they seem, or are they actually a "private language"? @kpal_koyena explores this in a clever way, by asking how one model's CoT works when fed to a different model....

Koyena Pal@kpal_koyena

Can models understand each other's reasoning? 🤔 When Model A explains its Chain-of-Thought (CoT), do Models B, C, and D interpret it the same way? Our new preprint with @davidbau and @csinva explores CoT generalizability 🧵👇 (1/7)

Chandan Singh retweeted

Mert Yuksekgonul@mertyuksekgonul·Jan 22

How to get AI to make discoveries on open scientific problems? Most methods just improve the prompt with more attempts. But the AI itself doesn't improve. With test-time training, AI can continue to learn on the problem it’s trying to solve: test-time-training.github.io/discover.pdf

Chandan Singh retweeted

Koyena Pal@kpal_koyena·Jan 22

Chandan Singh@csinva·Jan 20

Paper: arxiv.org/abs/2601.09072 Code: github.com/jjfenglab/HACHI Led by the wonderful @Jean_J_Feng and Avni Kothari!

Chandan Singh@csinva·Jan 20

Really excited about our new work, which makes building clinical prediction models way easier! AI agents do the grunt work of hypothesizing and validating EHR features, enabling easy auditing by clinicians. Iterating this process yields sensible SOTA (fully interpretable!) models.

Chandan Singh@csinva·Dec 13

Here is the intern job posting! apply.careers.microsoft.com/careers/job/19…

Chandan Singh@csinva·Dec 1

FTE job posting: apply.careers.microsoft.com/careers/job/19… Internship job posting coming soon...

Chandan Singh@csinva·Dec 1

I’ll be at NeurIPS helping to hire for MSR FTE roles, research interns (esp. in LLM interpretability), & presenting these papers — DM me if you’d like to meet up!

Chandan Singh retweeted

Yiping Wang@ypwang61·Dec 1

8B model can outperform AlphaEvolve on open optimization problems by scaling compute for inference or test-time RL🚀! ⭕Circle packing: AlphaEvolve (Gemini-2.0-Flash/Pro) : 2.63586276 Ours (DeepSeek-R1-0528-Qwen3-8B) : 2.63598308 🔗in🧵 [1/n]

Chandan Singh@csinva·Dec 1

Bayesian concept bottleneck models poster Wed evening x.com/Jean_J_Feng/st…

Jean Feng@Jean_J_Feng

“Bayesian concept bottleneck models using LLM priors” (BC-LLM) was accepted at #neurips2025! BC-LLM tackles a key clinical AI problem: how do we find interpretable features for predicting an outcome when there are ♾ possibilities from the electronic health record (e.g. notes)?

Chandan Singh@csinva·Dec 1

Generalized induction head poster Thurs evening x.com/csinva/status/…

Chandan Singh@csinva

Mechanistic interp has made cool findings but struggled to make them useful. We show that "induction heads" found in LLMs can be reverse-engineered to yield accurate & interpretable next-word prediction models. Led by @eunjikim4747 & Sriya Mantena arxiv.org/abs/2411.00066 🧵1/2
