We're thrilled to announce the first Eleos Conference on AI Consciousness and Welfare.
Join us Nov 21-23, 2025 in Berkeley, CA for discussions on AI welfare with leading researchers from @nyuniversity, @Google, @AnthropicAI, & more.
The challenge of AI moral status is part of a broader, global challenge: developing AI responsibly and preparing for its impact on society. I’m especially excited about helping Eleos to work out what success in meeting this challenge looks like—and taking actions to achieve it.
The questions I’ll be working on at Eleos, about the conditions for consciousness and the grounds of moral status, are deeply interesting and important. I’m looking forward to renewing my collaboration with @rgblong and continuing to build the community of AI welfare researchers.
@TheDavidSJ @rgblong @conjurial @EricElmoznino Distinguishing the environment from the system is a very interesting issue, something I’d like to think more about, but perhaps GWT can just say there have to be two loops.
@patrickbutlin @rgblong @conjurial @EricElmoznino From a computational functionalist perspective, how do we distinguish the “environment” from the “system”? What if the tokens are just being used for an inner monologue before producing the “real” output?
@patrickbutlin, @rgblong, et al. argue that transformers lack the recurrence indicator of phenomenal consciousness required by Global Workspace Theory. However, autoregressive decoder-only transformers are recurrent via the token sequence. What am I missing?
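The recurrence claim above can be sketched in a few lines. This is a toy illustration, not a real transformer: `next_token` is a hypothetical stand-in for a frozen decoder-only model. The point is structural — each forward pass is feedforward, but the generation loop feeds every output token back in as input, closing a loop through the growing context.

```python
# Toy sketch of "recurrence via the token sequence": the model itself is a
# pure function of its context, but sampling feeds outputs back as inputs.

def next_token(context):
    # Hypothetical stand-in for a frozen decoder-only transformer;
    # here just a toy rule: sum of the context, mod 10.
    return sum(context) % 10

def generate(prompt, steps):
    tokens = list(prompt)
    for _ in range(steps):
        # The new token re-enters the context: a loop through the sequence,
        # even though next_token has no internal recurrent state.
        tokens.append(next_token(tokens))
    return tokens

print(generate([1, 2, 3], 4))  # → [1, 2, 3, 6, 2, 4, 8]
```

Whether this outer loop counts as the kind of recurrence GWT requires — or is merely a causal loop through the "environment" of the token stream — is exactly the question debated in the replies.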
arxiv.org/abs/2308.08708
@rgblong @TheDavidSJ @conjurial @EricElmoznino I agree with Rob about modules. But also, I’m not sure the environment a system acts on can be its workspace. Otherwise any system with a causal loop through an environment would selectively “write to” and “read from” a workspace. I think GWT requires an inner loop.
@tom4everitt @MishaLaskin I think so, although a strange one. It seems to learn an RL algorithm during training, then use it to select actions. Or maybe it creates a simulacrum which is an agent.
In our new work - Algorithm Distillation - we show that transformers can improve themselves autonomously through trial and error without ever updating their weights.
No prompting, no finetuning. A single transformer collects its own data and maximizes rewards on new tasks.
1/N
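The test-time loop described in the Algorithm Distillation thread can be sketched as follows. This is a minimal toy, not the paper's architecture: `policy` is a hypothetical stand-in for the frozen transformer, and the "learning" it exhibits (exploit the best rewarded action seen so far, otherwise explore) is hard-coded rather than distilled. The structural point survives, though — the weights never change; improvement comes purely from conditioning on a growing in-context history of transitions.

```python
# Sketch of in-context improvement with frozen weights: the model conditions
# on its own (observation, action, reward) history and picks the next action.

def policy(history, n_actions):
    # Hypothetical stand-in for the frozen transformer. Toy behavior:
    # exploit the best rewarded action seen so far; until a reward is
    # observed, sweep through the actions deterministically to explore.
    best = max(history, key=lambda t: t[2], default=None)
    if best is None or best[2] == 0:
        return len(history) % n_actions
    return best[1]

def rollout(reward_fn, n_actions=5, steps=10):
    history = []  # the in-context "dataset"; this grows, the weights don't
    for step in range(steps):
        action = policy(history, n_actions)
        reward = reward_fn(action)
        history.append((step, action, reward))
    return history

# Task where only action 3 is rewarded: the loop explores 0, 1, 2, finds 3,
# then exploits it for the rest of the episode — no prompting, no finetuning.
hist = rollout(lambda a: 1 if a == 3 else 0)
print(hist)
```

In the real paper the exploit/explore behavior is not hand-written; it is what the transformer learns by being trained to predict the action sequences of an RL algorithm's own learning histories.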