Patrick Butlin

27 posts

@patrickbutlin

Philosopher @eleosai

Joined August 2022

658 Following · 697 Followers

Patrick Butlin@patrickbutlin·3d

Link to the paper: arxiv.org/abs/2605.13339

Patrick Butlin@patrickbutlin·3d

Our research is complementary to Anthropic's concurrent work on emotion concepts (transformer-circuits.pub/2026/emotions/…); we used a different method to extract evaluative representations and studied how they interact with varying personas.

Patrick Butlin@patrickbutlin·3d

Another exciting @MATSprogram paper, this time from the brilliant @gilg_oscar. We found a direction in LLMs that apparently performs a persona-relative evaluative function in some very different contexts.

Oscar Gilg@gilg_oscar

First preprint! Working with @patrickbutlin during @MATSprogram. LLM Assistant personas like being helpful, evil personas like being harmful. We found that a single direction represents helping as good under the Assistant, and ‘harm’ as good under evil.

Patrick Butlin@patrickbutlin·Apr 20

@MATSprogram @gilg_oscar link here: philpapers.org/archive/BECWIT…

Patrick Butlin@patrickbutlin·Apr 20

Many thanks to @MATSprogram for making our collaboration possible - and look out for another paper, with the equally excellent @gilg_oscar, coming soon!

Patrick Butlin@patrickbutlin·Apr 20

I'm proud to announce this new paper with my fantastic @MATSprogram fellow @BeckmannPierre, on personas and LLM individuation.

Pierre Beckmann@BeckmannPierre

New paper with @PatrickButlin, from my time at @MATSprogram. We propose two new candidates for LLM individuation: the (virtual) instance-persona view and the model-persona view. 🧵

Patrick Butlin@patrickbutlin·3 Mar

5. 'Higher-order representation in AI' (unfortunately slightly dated already): philosophymindscience.org/index.php/phim…

Patrick Butlin@patrickbutlin·3 Mar

1. 'Desire in AI': philarchive.org/rec/BUTDIA
2. 'Are any machines conscious today?': philarchive.org/rec/BUTAAM-2
3. 'Testing for consciousness in current AI': philarchive.org/rec/BUTTFC
4. 'Consciousness and AI' encyclopaedia entry: oecs.mit.edu/pub/zf1nbs6d/

Patrick Butlin@patrickbutlin·3 Mar

Some recent papers:

Patrick Butlin@patrickbutlin·Nov 11

Many thanks to the editor and reviewers for @TrendsCognSci and especially to my co-authors, including @rgblong @Yoshua_Bengio @birchlse @davidchalmers42 @ConstantAxel @georgejwdeane @EricElmoznino @kanair @MatthiasMichel_ @Liad_Mudrik @meganakpeters @eschwitz and others!

Patrick Butlin@patrickbutlin·Nov 11

The new paper is here: sciencedirect.com/science/articl…

Patrick Butlin@patrickbutlin·Nov 11

New paper on AI consciousness! Here we present the theory-derived indicator method for assessing AI systems for consciousness. Link below.

Patrick Butlin reposted

Eleos AI Research@eleosai·Sep 4

We're thrilled to announce the first Eleos Conference on AI Consciousness and Welfare. Join us Nov 21-23, 2025 in Berkeley, CA for discussions on AI welfare with leading researchers from @nyuniversity, @Google, @AnthropicAI, & more.

Patrick Butlin reposted

J. AI Research-JAIR@JAIR_Editor·26 Mar

New Article: "Principles for Responsible AI Consciousness Research" by Butlin and Lappas jair.org/index.php/jair…

Patrick Butlin@patrickbutlin·18 Mar

The challenge of AI moral status is part of a broader, global challenge: developing AI responsibly and preparing for its impact on society. I’m especially excited about helping Eleos to work out what success in meeting this challenge looks like—and taking actions to achieve it.

Patrick Butlin@patrickbutlin·18 Mar

The questions I’ll be working on at Eleos, about the conditions for consciousness and the grounds of moral status, are deeply interesting and important. I’m looking forward to renewing my collaboration with @rgblong and continuing to build the community of AI welfare researchers.

Patrick Butlin@patrickbutlin·18 Mar

I’m happy to announce that I’m joining @eleosai. At Eleos, I’ll continue my work on the philosophy and science of AI minds and moral status. [1/3]
