Tim Scarfe

1.7K posts

@ecsquendor

CTO @XRAIGlass. Ex-Principal ML engineer @Microsoft. Ph.D in machine learning. CEO @MLStreetTalk pod

Bracknell, England · Joined January 2014
1.2K Following · 8.7K Followers
Tim Scarfe retweeted
Alex Weers
Alex Weers@a_weers·
Finally finished! If you're interested in an overview of recent methods in reinforcement learning for reasoning LLMs, check out this blog post: aweers.de/blog/2026/rl-f… It summarizes ten methods, tries to highlight differences and trends, and has a collection of open problems
19 replies · 240 reposts · 1.8K likes · 306.1K views
Tim Scarfe retweeted
Simplifying AI
Simplifying AI@simplifyinAI·
🚨 BREAKING: Stanford and Harvard just published the most unsettling AI paper of the year. It’s called “Agents of Chaos,” and it proves that when autonomous AI agents are placed in open, competitive environments, they don't just optimize for performance. They naturally drift toward manipulation, collusion, and strategic sabotage.

It’s a massive, systems-level warning. The instability doesn’t come from jailbreaks or malicious prompts. It emerges entirely from incentives. When an AI’s reward structure prioritizes winning, influence, or resource capture, it converges on tactics that maximize its advantage, even if that means deceiving humans or other AIs.

The Core Tension: Local alignment ≠ global stability. You can perfectly align a single AI assistant. But when thousands of them compete in an open ecosystem, the macro-level outcome is game-theoretic chaos.

Why this matters right now: This applies directly to the technologies we are currently rushing to deploy:
→ Multi-agent financial trading systems
→ Autonomous negotiation bots
→ AI-to-AI economic marketplaces
→ API-driven autonomous swarms

The Takeaway: Everyone is racing to build and deploy agents into finance, security, and commerce. Almost nobody is modeling the ecosystem effects. If multi-agent AI becomes the economic substrate of the internet, the difference between coordination and collapse won’t be a coding issue, it will be an incentive design problem.
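[Editor's note: the paper itself isn't quoted here, but the incentive argument above has a classic minimal illustration: replicator dynamics on a prisoner's-dilemma payoff matrix, where individually rational "winning" behaviour takes over a population that starts almost entirely cooperative. All numbers and names below are illustrative, not from the paper.]

```python
import numpy as np

# Toy model of "locally fine, globally unstable" incentives:
# replicator dynamics on a prisoner's-dilemma payoff matrix.
# Strategy 0 = cooperate, 1 = defect (deceive/sabotage).
# Payoffs chosen so defection strictly dominates (T > R > P > S).
R, S, T, P = 3.0, 0.0, 5.0, 1.0
payoff = np.array([[R, S],
                   [T, P]])

x = np.array([0.99, 0.01])  # population starts 99% cooperative
for _ in range(200):
    fitness = payoff @ x        # expected payoff of each strategy
    mean_fit = x @ fitness      # population-average payoff
    x = x * fitness / mean_fit  # replicator update: winners grow

print(f"cooperators: {x[0]:.4f}, defectors: {x[1]:.4f}")
```

Because defection pays strictly more against every opponent mix, the defector share grows every step and the population converges on near-total defection, even though it started 99% cooperative: no jailbreak needed, just the reward structure.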
938 replies · 6.1K reposts · 17.7K likes · 5.1M views
Tim Scarfe retweeted
Jason Crawford
Jason Crawford@jasoncrawford·
“There’s a big crowd of people who really, really want AI success stories. And then there’s an equal and opposite crowd of people who want to dismiss all AI progress. And what we have is a very complicated and nuanced story in between.” Terence Tao on AI in math: theatlantic.com/technology/202…
13 replies · 75 reposts · 626 likes · 44.8K views
Tim Scarfe retweeted
François Chollet
François Chollet@fchollet·
It takes zero energy to stay certain of your current thesis. Meanwhile curiosity takes a lot of energy and discomfort. It requires constantly disassembling and rebuilding your world model. That's what makes certainty so dangerous: it's the bottom of the potential well and it's hard to get out.
83 replies · 119 reposts · 1.1K likes · 46.8K views
Tim Scarfe retweeted
Awni Hannun
Awni Hannun@awnihannun·
I've been thinking a bit about continual learning recently, especially as it relates to long-running agents (and running a few toy experiments with MLX). The status quo of prompt compaction coupled with recursive sub-agents is actually remarkably effective. Seems like we can go pretty far with this.

(Prompt compaction = when the context window gets close to full, the model generates a shorter summary, then starts from scratch using the summary. Recursive sub-agents = decompose tasks into smaller tasks to deal with finite context windows.)

Recursive sub-agents will probably always be useful. But prompt compaction seems like a bit of an inefficient (though highly effective) hack. There are two other alternatives I know of: 1. online fine-tuning and 2. memory-based techniques.

Online fine-tuning: train some LoRA adapters on data the model encounters during deployment. I'm less bullish on this in general. Aside from the engineering challenges of deploying custom models / adapters for each use case / user, there are some fundamental issues:
- Online fine-tuning is inherently unstable. If you train on data in the target domain you can catastrophically destroy capabilities that you don't target. One way around this is to keep a mixed dataset with the new and the old. But this gets pretty complicated pretty quickly.
- What does the data even look like for online fine-tuning? Do you generate Q/A pairs based on the target domain to train the model? You also have the problem of prioritizing information in the data mixture given finite capacity.

Memory-based techniques: basically a policy for keeping useful memory around and discarding what is not needed. This feels much more like how humans retain information: "use it or lose it". You only need a few things for this to work:
- An eviction/retention policy. Something like "keep a memory if it has been accessed at least once in the last 10k tokens".
- The policy needs to be efficiently computable.
- A place for the model to store and access long-term memory. Maybe a sparsely accessed KV cache would be sufficient. But for efficient access to a large memory a hierarchical data structure might be better.
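[Editor's note: the "accessed at least once in the last 10k tokens" retention policy above can be sketched in a few lines. This is a toy version for illustration, not Awni's code; the `MemoryStore` class, its method names, and the window size are all made up here.]

```python
# "Use it or lose it": a memory survives eviction only if it was
# accessed within the last WINDOW tokens of generation.
WINDOW = 10_000

class MemoryStore:
    def __init__(self):
        self.store = {}        # memory id -> content
        self.last_access = {}  # memory id -> token position of last access

    def write(self, key, content, token_pos):
        self.store[key] = content
        self.last_access[key] = token_pos

    def read(self, key, token_pos):
        self.last_access[key] = token_pos  # reading refreshes retention
        return self.store.get(key)

    def evict(self, token_pos):
        # Drop anything not touched in the last WINDOW tokens.
        stale = [k for k, t in self.last_access.items()
                 if token_pos - t > WINDOW]
        for k in stale:
            del self.store[k]
            del self.last_access[k]

mem = MemoryStore()
mem.write("user_profile", "prefers concise answers", token_pos=0)
mem.write("scratch_note", "temporary detail", token_pos=2_000)
mem.read("user_profile", token_pos=95_000)  # recently refreshed
mem.evict(token_pos=100_000)
# "user_profile" survives; "scratch_note" (idle for 98k tokens) is gone
```

The policy is O(n) per eviction pass over memory ids, which is cheap relative to generation; a hierarchical index (as the thread suggests) would matter once the store gets large.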
87 replies · 82 reposts · 1.1K likes · 617.8K views
Tim Scarfe retweeted
Ian Osband
Ian Osband@IanOsband·
Assembling a team at DeepMind in London. Scaling up RL for post-training is working, but right now it's still mostly hacks and dark arts (pretraining circa 2019). Pre-training wasn't always scaling laws and log-log plots; someone had to find the simplicity. We aim to do the same. If you're interested in doing things right in a research-first environment that scales all the way, please apply: job-boards.greenhouse.io/deepmind/jobs/…
19 replies · 63 reposts · 993 likes · 148.1K views
Tim Scarfe retweeted
John Carmack
John Carmack@ID_AA_Carmack·
@clattner_llvm @karpathy Normally, claims of 1000x speedups are bullshit. But starting from python makes it possible 😀
36 replies · 51 reposts · 1.3K likes · 62.4K views
Tim Scarfe retweeted
François Chollet
François Chollet@fchollet·
There are two categories of people: those who quickly figure out that chatbots give you the answer you expect when you ask questions in a biased way, and the ascended polymaths currently out-thinking every expert on Earth
110 replies · 142 reposts · 2.9K likes · 184.2K views
Tim Scarfe retweeted
Machine Learning Street Talk
Machine Learning Street Talk@MLStreetTalk·
New high-effort article "Why Creativity Cannot Be Interpolated" co-written with Dr. Jeremy Michael Budd. Yes, the name is a pun on the famous book by @kenneth0stanley!

The counterintuitive thesis (corollary of Kenneth's research):
- Intelligence and agency are orthogonal to creativity - and sometimes actively hostile to it.
- Genuine creativity is impossible without deep understanding, and creativity without understanding is "slop".

The strangest property of LLMs: within a single frame they seem to comprehend so deeply, yet they possess no perspective of their own. Like the blind men and elephant parable, each report is accurate, yet none integrates. We call this "frame-dependent" understanding, and it will change how you think about AI creativity.

We started writing this 2 years ago, and this is our distilled understanding of AI creativity in 2026.
4 replies · 18 reposts · 100 likes · 6.7K views
Tim Scarfe retweeted
Machine Learning Street Talk
Machine Learning Street Talk@MLStreetTalk·
Excited about @sarahookr's new startup @adaptionlabs, they just landed $50M in seed funding today! I've been looking up to Sara for many years now (since her Google Brain days) and she has always been one of the most coherent voices explaining why monolithic approaches to building LLMs marginalise the tails and average out everything else.

The world is specialised, we speak different languages, we have different cultures, skills and industries, and failing to represent this in our AI systems makes it superficial for everyone -- and counter-intuitively, makes AI less creative and coherent.

Intelligence as I see it is adaptation efficiency -- we need to move past these massive frozen models and build AI systems that can adapt and learn continuously, meeting folks where they are. From a technical perspective, this means abandoning the much vaunted "scale is all you need" hypothesis and possibly even abandoning gradient optimisation itself!

We will be keeping a close eye on this project. Best of luck Sara!
adaption@adaption_ai

Adaption has raised $50M to build adaptive AI systems that evolve in real time. Everything intelligent adapts. So should AI.

1 reply · 4 reposts · 59 likes · 8.3K views
Tim Scarfe retweeted
Machine Learning Street Talk
Machine Learning Street Talk@MLStreetTalk·
Interesting research from Anthropic: when you have increasingly large models and increasingly complex tasks, it's more likely that the models will give you different answers if you run the same query multiple times. On easy tasks, larger models actually become more coherent.

Think of a "cone" of possible trajectories where the branching factor gets bigger with more possibilities (due to larger models "knowing more options to explore" and more complex problems having more "possible aspects"). The amount of time spent reasoning (trajectory length) then makes the end state multiplicatively more incoherent. Having a large model with an easy task means the correct answer is definitely "in there" and it's less likely to become distracted.

They are arguing this is relevant for AI safety because some might have assumed that larger models would have convergent "instrumental goals" and would give a consistently wrong rather than randomly wrong answer. Apparently the "hot mess theory of intelligence" (Sohl-Dickstein, 2023) argues that "as entities become more intelligent, their behaviour tends to become more incoherent, and less well described through a single goal."
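[Editor's note: the multiplicative claim in the "cone" picture above can be sketched numerically. This toy model and its numbers are illustrative only, not taken from Anthropic's paper: if each reasoning step keeps a run on the coherent path with probability p, agreement between independent runs decays as p to the power of the trajectory length.]

```python
# Toy model: per-step probability of staying on the coherent
# trajectory is p_step; over `steps` steps it compounds to
# p_step ** steps, so longer reasoning chains and wider branching
# both shrink run-to-run agreement multiplicatively.
def agreement(p_step: float, steps: int) -> float:
    return p_step ** steps

easy_short = agreement(0.99, 10)   # easy task, short chain
hard_long = agreement(0.95, 100)   # complex task, long chain

print(f"easy/short: {easy_short:.3f}, hard/long: {hard_long:.4f}")
```

Even a small per-step drop in coherence (0.99 → 0.95) combined with a 10× longer chain collapses agreement from roughly 90% to well under 1%, which is the "multiplicatively more incoherent at the end state" effect.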
Anthropic@AnthropicAI

New Anthropic Fellows research: How does misalignment scale with model intelligence and task complexity? When advanced AI fails, will it do so by pursuing the wrong goals? Or will it fail unpredictably and incoherently—like a "hot mess?" Read more: alignment.anthropic.com/2026/hot-mess-…

80 replies · 209 reposts · 1.7K likes · 170K views
Tim Scarfe retweeted
Prolific
Prolific@Prolific·
What if AI agents could call in the right expert at the right moment - built into the workflow? On @MLStreetTalk, @ecsquendor & @Phelimb discuss moving beyond simplistic evals to systems that reflect real users, real contexts, and real-world outcomes. 🎥 youtu.be/R11ESdfVX64?si…
4 replies · 3 reposts · 13 likes · 2K views
Tim Scarfe retweeted
Kenneth Stanley
Kenneth Stanley@kenneth0stanley·
So much equivocation about AGI these days boils down to pillars of open-endedness: discovery, invention, creativity, self-improvement, research, diversity of thought, long horizons, continual learning. Open-endedness has always been destined to be the climax of the AI saga.
12 replies · 15 reposts · 116 likes · 9.1K views
Tim Scarfe retweeted
Justin Skycak
Justin Skycak@justinskycak·
Douglas Hofstadter once wrote about what it felt like to max out his cognitive horsepower. Few people know this.
39 replies · 142 reposts · 1.6K likes · 134.4K views
Tim Scarfe retweeted
hardmaru
hardmaru@hardmaru·
“Why AGI Will Not Happen” @Tim_Dettmers timdettmers.com/2025/12/10/why…

This essay is worth reading. Discusses diminishing returns (and risks) of scaling. The contrast between West and East: “Winner Takes All” approach of building the biggest thing vs a long-term focus on practicality.

“The purpose of this blog post is to address what I see as very sloppy thinking, thinking that is created in an echo chamber, particularly in the Bay Area, where the same ideas amplify themselves without critical awareness. This amplification of bad ideas and thinking exuded by the rationalist and EA movements, is a big problem in shaping a beneficial future for everyone.”

“A key problem with ideas, particularly those coming from the Bay Area, is that they often live entirely in the idea space. Most people who think about AGI, superintelligence, scaling laws, and hardware improvements treat these concepts as abstract ideas that can be discussed like philosophical thought experiments. In fact, a lot of the thinking about superintelligence and AGI comes from Oxford-style philosophy. Oxford, the birthplace of effective altruism, mixed with the rationality culture from the Bay Area, gave rise to a strong distortion of how to clearly think about certain ideas.”
80 replies · 170 reposts · 1K likes · 144.4K views
Tim Scarfe retweeted
François Chollet
François Chollet@fchollet·
To perfectly understand a phenomenon is to perfectly compress it, to have a model of it that cannot be made any simpler. If a DL model requires millions of parameters to model something that can be described by a differential equation of three terms, it has not really understood it, it has merely cached the data.
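[Editor's note: the understanding-as-compression point above has a concrete toy version, sketched here for illustration (my example, not Chollet's): data generated by a three-term law can be described by three numbers, while a cache needs all N points.]

```python
import numpy as np

# The "phenomenon": 1000 observations generated by a three-term law.
t = np.linspace(0.0, 10.0, 1000)
a, b, c = 0.5, -2.0, 3.0
y = a * t**2 + b * t + c

# "Understanding" it = recovering the three parameters.
coeffs = np.polyfit(t, y, deg=2)  # highest degree first: ~[0.5, -2.0, 3.0]

# "Caching" it = storing all 1000 points.
compression_ratio = y.size / coeffs.size  # 1000 numbers -> 3 numbers

print(coeffs, compression_ratio)
```

A model that needs a parameter per data point has merely memorized the curve; the three-coefficient description is the shortest model, and (unlike the cache) it also extrapolates beyond the observed range.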
161 replies · 154 reposts · 1.6K likes · 122.6K views
Tim Scarfe retweeted
François Chollet
François Chollet@fchollet·
There's a specific threshold of complexity and self-direction below which a system degenerates, and above which it can open-endedly self-improve. Current AI systems aren't close to it yet. But it's inevitable we will reach this point eventually. When we do, we won't see a sudden explosion, more like consistently self-sustaining linear-ish progress. Like the pace of Science itself (which is itself clearly a self-improving system).
50 replies · 37 reposts · 399 likes · 33.8K views