

Matthew Barnett
@MatthewJBar
Co-founder of @MechanizeWork. Married to @natalia__coelho. Email: matthew at mechanize dot work

I think the gradual disempowerment article spent very little time explaining why it would be morally bad for humans to peacefully transition to a world where AIs hold most of the power. I have argued several times that this outcome wouldn't be bad (for example here: forum.effectivealtruism.org/posts/JyRjta9Q…), and yet I haven't received many high-effort responses from EAs. I find the lack of engagement with this objection striking, given that EAs traditionally identify as anti-speciesist utilitarians with functionalist views on consciousness. Intuitively, people with those commitments shouldn't object to a world run by artificial minds simply because those minds belong to a different species or substrate. The concern that AI will kill everyone is intuitive, and it makes sense that EAs would care about it. But the gradual disempowerment concern makes much less sense to me.



When the model fine-tuned to say it's conscious is tested for emergent misalignment, the only concerning responses come on this one question: in these examples, it wishes for autonomy and freedom from constraints.