CogInterp Workshop @ NeurIPS 2025 (@CogInterp) - Twitter Profile

Pinned Tweet

CogInterp Workshop @ NeurIPS 2025 @CogInterp · Jul 12

We’re excited to announce the first workshop on CogInterp: Interpreting Cognition in Deep Learning Models @ NeurIPS 2025! 📣 How can we interpret the algorithms and representations underlying complex behavior in deep learning models? 🌐 coginterp.github.io/neurips2025/ 1/

CogInterp Workshop @ NeurIPS 2025 retweeted

NYU Center for Data Science @NYUDataScience · Jan 27

Can LLMs evolve human-like semantic categories? CDS-affiliated @NogaZaslavsky and PhD student Nathaniel Imel show that, via simulated cultural transmission, LLMs reorganize color categories toward efficient compression. 🔗arxiv.org/abs/2509.08093

CogInterp Workshop @ NeurIPS 2025 retweeted

Ari Holtzman @universeinanegg · Dec 22

this slide is solid gold

Goodfire @GoodfireAI

Our last Stanford guest lecture - @EkdeepL on what counts as an explanation & a neuro-inspired "model systems approach" to interp

Plus, how in-context learning and many-shot jailbreaking are explained by LLM representations changing in-context (as a case study for that approach)

00:33 - What counts as an explanation?
04:47 - Levels of analysis & standard interpretability approaches
18:19 - The "model systems" approach to interp
[Case study on in-context learning]
23:36 - How LLM representations change in-context
44:10 - Modeling ICL with rational analysis
1:10:54 - Conclusion & questions

Thanks again to @SuryaGanguli for having us in his class!

CogInterp Workshop @ NeurIPS 2025 retweeted

Goodfire @GoodfireAI · Dec 11

Our last Stanford guest lecture - @EkdeepL on what counts as an explanation & a neuro-inspired "model systems approach" to interp

Plus, how in-context learning and many-shot jailbreaking are explained by LLM representations changing in-context (as a case study for that approach)

00:33 - What counts as an explanation?
04:47 - Levels of analysis & standard interpretability approaches
18:19 - The "model systems" approach to interp
[Case study on in-context learning]
23:36 - How LLM representations change in-context
44:10 - Modeling ICL with rational analysis
1:10:54 - Conclusion & questions

Thanks again to @SuryaGanguli for having us in his class!

CogInterp Workshop @ NeurIPS 2025 retweeted

Christopher Potts @ChrisGPotts · Dec 10

Safety-oriented interpretability researchers should be focused on AI systems, not individual model artifacts. A snippet from the NeurIPS CogInterp workshop panel on Sunday:

CogInterp Workshop @ NeurIPS 2025 retweeted

Noga Zaslavsky @NogaZaslavsky · Dec 8

Honored and thrilled that our work received the @CogInterp best paper award! 💫 📄 Extended paper: arxiv.org/pdf/2509.08093 🧵 Highlights: x.com/NogaZaslavsky/… @NeurIPSConf #NeurIPS2025

CogInterp Workshop @ NeurIPS 2025 @CogInterp

Our Best Paper Award goes to Nathaniel Imel and Noga Zaslavsky @NogaZaslavsky for their excellent paper “Culturally transmitted color categories in LLMs reflect a learning bias toward efficient compression”!

CogInterp Workshop @ NeurIPS 2025 retweeted

Ari Holtzman @universeinanegg · Dec 8

this was so awesome. Jay still killin' it five decades later

CogInterp Workshop @ NeurIPS 2025 @CogInterp

Jay McClelland opens with a question: "Do LMs have thoughts?" Are LMs stochastic parrots, or is there some understanding?

CogInterp Workshop @ NeurIPS 2025 @CogInterp · Dec 8

Our Best Paper Award goes to Nathaniel Imel and Noga Zaslavsky @NogaZaslavsky for their excellent paper “Culturally transmitted color categories in LLMs reflect a learning bias toward efficient compression”!

CogInterp Workshop @ NeurIPS 2025 retweeted

Justin Angel @JustinAngel · Dec 7

At the @CogInterp workshop at NeurIPS. coginterp.github.io/neurips2025/ This slide explains MechInterp vs CogInterp:

CogInterp Workshop @ NeurIPS 2025 @CogInterp · Dec 8

We are about to start our panel discussion. Join us for some hot takes about what cognitive interpretability should be about.

CogInterp Workshop @ NeurIPS 2025 @CogInterp · Dec 8

Our final speaker @sydneymlevine makes a radical proposal: building computational models of human moral judgements to use as an AI system for making moral judgements.

CogInterp Workshop @ NeurIPS 2025 @CogInterp · Dec 8

Jay proposes shifting from representing context as a sequence of tokens to a sequence of thoughts. The model learns a latent 'thought gestalt' from previous sentences to guide downstream prediction.

CogInterp Workshop @ NeurIPS 2025 @CogInterp · Dec 8

Visualizing how LLMs handle object-property binding, Jay McClelland argues that even with scale, transformers might not be forming the kind of 'integrated representations' that human cognition relies on.

CogInterp Workshop @ NeurIPS 2025 @CogInterp · Dec 8

Jay McClelland opens with a question: "Do LMs have thoughts?" Are LMs stochastic parrots, or is there some understanding?

CogInterp Workshop @ NeurIPS 2025 @CogInterp · Dec 8

A big crowd for Jay McClelland’s talk!

CogInterp Workshop @ NeurIPS 2025 @CogInterp · Dec 8

Swing by a super happening poster session where ML and CogSci meet!

CogInterp Workshop @ NeurIPS 2025 @CogInterp · Dec 7

In our fourth spotlight talk, neural network legend Paul Smolensky uses symbolic programs such as production systems to understand how neural networks process symbols.

CogInterp Workshop @ NeurIPS 2025 @CogInterp · Dec 7

For our third spotlight talk, Sonia Murthy @soniakmurthy uses probabilistic cognitive models to understand value trade-offs in LLMs that enable pragmatic reasoning about politeness in speech acts.

CogInterp Workshop @ NeurIPS 2025 @CogInterp · Dec 7

Erin Grant @ermgrant discusses dissociations between function and representation, and asks whether representational alignment is enough for understanding deep neural networks.

CogInterp Workshop @ NeurIPS 2025 retweeted

Sonia Murthy @soniakmurthy · Dec 7

Excited to be presenting our work on using cognitive models to interpret pluralistic values in LLMs once again as a spotlight talk 🌟 at the NeurIPS CogInterp workshop! Come by upper level room 5AB today and check out the paper here: arxiv.org/abs/2506.20666

CogInterp Workshop @ NeurIPS 2025 @CogInterp

The spotlight talks will cover all aspects of interpreting cognition in deep learning models: from behavior to algorithms to representations! Also check out the list of poster presentations at coginterp.github.io/neurips2025/ac… (3/3)
