Shanaka Anslem Perera ⚡@shanaka86
The most honest sentence in the entire AI industry right now is one nobody wants to say out loud.
Every major foundation model was trained on data its creators did not have explicit permission to use. Every single one.
Anthropic settled for $1.5 billion over 7 million pirated books used to train Claude.
OpenAI faces ongoing lawsuits from authors, newspapers, and code repositories.
Google trained on the entire indexed internet. Meta used Library Genesis datasets. And xAI's Grok was trained on the full corpus of X posts, a decision Musk made unilaterally as the platform's owner, without individual user consent.
So when Elon Musk tweets that "Anthropic is guilty of stealing training data at massive scale and has had to pay multi-billion dollar settlements for their theft. This is just a fact," he is telling a true but deeply selective version of the story.
The settlement is real. The $1.5 billion is documented. The pirated books are documented. But framing this as an Anthropic problem rather than an industry-wide structural reality is competitive positioning disguised as moral outrage.
Here is the actual mechanism nobody is mapping.
Anthropic accused Chinese labs of distilling Claude through its public API. Musk responded by pointing out that Anthropic trained on stolen data.
Gergely Orosz, a respected engineer, wrote "Anthropic can't have it both ways." All three, Anthropic, Musk, and Orosz, are correct simultaneously, and all three are being selectively honest.
The structural reality is that the entire foundation model industry sits on an unresolved intellectual property question worth hundreds of billions of dollars. Every lab trained on data it did not license. Every lab knows this.
Every lab’s legal strategy is to get big enough that the settlement becomes a cost of doing business rather than an existential threat. Anthropic already paid $1.5 billion. That is not a punishment. That is a licensing fee paid retroactively under legal pressure.
The reason Musk is raising this now has nothing to do with ethics. Anthropic is in conversations with the Pentagon. xAI is competing for the same contracts. Framing your competitor as a data thief three days before a defense meeting is not moral clarity. It is positioning.
And the deepest irony is the China angle. The United States wants to restrict Chinese access to American AI models on intellectual property grounds. But every American AI model was built on intellectual property its creators took without permission from millions of authors, coders, artists, and publishers.
The entire moral framework for the technology export control regime rests on an intellectual property argument that the American labs themselves have not resolved domestically.
That is not hypocrisy anyone in the industry wants to discuss because the moment you acknowledge it, the legal and regulatory exposure scales to every company simultaneously.
The way I see it: Musk is weaponizing it selectively, and Anthropic is deflecting it selectively.
And the actual creators whose work built every one of these models are watching billionaires argue about who stole from them more ethically.