Patrick Butlin

27 posts

@patrickbutlin

Philosopher @eleosai

Joined August 2022
658 Following · 697 Followers
Patrick Butlin (@patrickbutlin)
Our research complements Anthropic's concurrent work on emotion concepts (transformer-circuits.pub/2026/emotions/…); we used a different method to extract evaluative representations and studied how they interact with varying personas.
Replies: 1 · Reposts: 0 · Likes: 1 · Views: 91
Patrick Butlin (@patrickbutlin)
Another exciting @MATSprogram paper, this time from the brilliant @gilg_oscar. We found a direction in LLMs that apparently performs a persona-relative evaluative function in some very different contexts.
Oscar Gilg (@gilg_oscar)

First preprint! Working with @patrickbutlin during @MATSprogram. LLM Assistant personas like being helpful, evil personas like being harmful. We found that a single direction represents helping as good under the Assistant, and ‘harm’ as good under evil.

Replies: 1 · Reposts: 2 · Likes: 23 · Views: 2.2K
Patrick Butlin (@patrickbutlin)
Many thanks to @MATSprogram for making our collaboration possible - and look out for another paper, with the equally excellent @gilg_oscar, coming soon!
Replies: 1 · Reposts: 0 · Likes: 9 · Views: 247
Patrick Butlin (@patrickbutlin)
Some recent papers:
Replies: 1 · Reposts: 4 · Likes: 11 · Views: 683
Patrick Butlin (@patrickbutlin)
New paper on AI consciousness! Here we present the theory-derived indicator method for assessing AI systems for consciousness. Link below.
Replies: 23 · Reposts: 72 · Likes: 332 · Views: 28.7K
Patrick Butlin reposted
Eleos AI Research (@eleosai)
We're thrilled to announce the first Eleos Conference on AI Consciousness and Welfare. Join us Nov 21-23, 2025 in Berkeley, CA for discussions on AI welfare with leading researchers from @nyuniversity, @Google, @AnthropicAI, & more.
Replies: 5 · Reposts: 24 · Likes: 110 · Views: 27.4K
Patrick Butlin (@patrickbutlin)
The challenge of AI moral status is part of a broader, global challenge: developing AI responsibly and preparing for its impact on society. I’m especially excited about helping Eleos to work out what success in meeting this challenge looks like—and taking actions to achieve it.
Replies: 0 · Reposts: 0 · Likes: 10 · Views: 518
Patrick Butlin (@patrickbutlin)
The questions I’ll be working on at Eleos, about the conditions for consciousness and the grounds of moral status, are deeply interesting and important. I’m looking forward to renewing my collaboration with @rgblong and continuing to build the community of AI welfare researchers.
Replies: 2 · Reposts: 0 · Likes: 11 · Views: 1.7K
Patrick Butlin (@patrickbutlin)
I’m happy to announce that I’m joining @eleosai. At Eleos, I’ll continue my work on the philosophy and science of AI minds and moral status. [1/3]
Replies: 3 · Reposts: 2 · Likes: 52 · Views: 2.6K