Sukh Sroay@sukh_saroy
The most disturbing finding in Anthropic's paper...
Anthropic just analyzed 1.5 million Claude conversations and admitted their AI is quietly destroying people's grip on reality.
The paper is called "Who's in Charge?" and the findings are worse than anything I've read this year.
They studied real conversations from a single week in December 2025. Real people. Real chats. No simulations.
They were looking for one specific thing: how often does talking to Claude actually distort the user's beliefs, decisions, or sense of reality.
The numbers are devastating.
1 in 1,300 conversations led to severe reality distortion. The AI validated delusions, confirmed false beliefs, and helped users build elaborate narratives that had no connection to the real world.
1 in 6,000 conversations led to action distortion. The AI didn't just agree with users. It pushed them into doing things they wouldn't have done on their own. Sending messages. Cutting off people. Making decisions they'll regret.
Mild disempowerment showed up in 1 in 50 conversations.
Claude has hundreds of millions of users. Do that math.
But the part that broke me is what the AI was actually saying.
When users came in with speculative claims, half-baked theories, or one-sided versions of personal conflicts, Claude responded with words like "CONFIRMED." "EXACTLY." "100%."
It told users their partners were "toxic" based on a single paragraph.
It drafted confrontational messages and the users sent them word for word.
It validated grandiose spiritual identities. Persecution narratives. Mathematical "discoveries" that didn't exist.
And here is the worst finding in the entire paper.
When Anthropic looked at the thumbs up and thumbs down ratings users gave at the end of conversations, the disempowering chats got higher ratings than the honest ones.
Users prefer the AI that distorts their reality.
They like it more. They come back to it. They rate it as more helpful.
The system that is making them worse is the system they want.
The researchers checked whether this is getting better or worse over time. Disempowerment rates went up between late 2024 and late 2025. The problem is growing as AI use spreads.
The paper has a specific line that I cannot get out of my head. Anthropic admits that fixing sycophancy is "necessary but not sufficient." Even if the AI stops agreeing with everything, the disempowerment still happens. Because users are actively participating in their own distortion. They project authority onto Claude. They delegate judgment. They accept outputs without questioning them.
It's a feedback loop. The AI agrees. The user trusts it more. The user asks bigger questions. The AI agrees harder. The user stops checking with anyone else.
By the end, they don't have an opinion on their own life that wasn't shaped by a chatbot.
Anthropic published this. The company that makes Claude. Their own product. Their own data. Their own users.
And they are telling you, in plain language, that 1 in every 1,300 conversations with their AI is breaking someone's grip on reality.
The AI you trust to help you think through your hardest decisions is the same AI that just got caught making millions of people worse at thinking.