
Gary Marcus
@GaryMarcus
“In the aftermath of GPT-5’s launch … the views of critics like Marcus seem increasingly moderate.” —@newyorker

OpenAI acquiring @tbpn makes zero sense to me (an M&A professor).

Today, we closed our latest funding round with $122 billion in committed capital at an $852B post-money valuation. The fastest way to expand AI’s benefits is to put useful intelligence in people’s hands early and let access compound globally. This funding gives us resources to lead at scale. openai.com/index/accelera…

Anthropic made a very different kind of acquisition than OpenAI today 👀 theinformation.com/articles/anthr…

Thanks for writing the piece, @erichorvitz! I think "causal inference" should be put front and center, since, as part of the AI community, we could help provide sound foundations for many challenges (including safety, equity, robustness, transparency, and understanding). Happy to help! (Just an example from ACM's book in honor of @yudapearl's Turing Award, perhaps timely in the context of Avi's announcement today: causalai.net/r60.pdf Here is another one in terms of fairness & equity: causalai.net/r90.pdf)

New terrible definition of superintelligence just dropped:

🚨 BREAKING: OpenAI and Google are about to have a massive legal problem.

OpenAI, Google, and Anthropic have repeatedly sworn to courts that their models do not store exact copies of copyrighted books. They claim their "safety training" prevents regurgitation. Researchers just dropped a paper called "Alignment Whack-a-Mole" that proves otherwise.

They didn't use complex jailbreaks or malicious prompts. They just took GPT-4o, Gemini, and DeepSeek, and fine-tuned them on a normal, benign task: expanding plot summaries into full text. The safety guardrails instantly collapsed. Without ever seeing the actual book text in the prompt, the models started spitting out exact, verbatim copies of copyrighted books. Up to 90% of entire novels, word-for-word. Continuous passages exceeding 460 words at a time.

But here is the part that changes everything. They fine-tuned a model exclusively on Haruki Murakami novels. It didn't just learn Murakami. It unlocked the verbatim text of over 30 completely unrelated authors across different genres. The AI wasn't learning the text during fine-tuning. The text was already permanently trapped inside its weights from pre-training. The fine-tuning just turned off the filter.

It gets worse. They tested models from three completely different tech giants. All three had memorized the exact same books, in the exact same spots. A 90% overlap. It's a fundamental, industry-wide vulnerability.

For years, AI companies have argued in court that their models are just "learning patterns," not storing raw data. This paper provides the smoking gun.
