Tim Scarfe

1.7K posts

@ecsquendor

CTO @XRAIGlass. Ex-Principal ML engineer @Microsoft. Ph.D in machine learning. CEO @MLStreetTalk pod

Bracknell, England · Joined January 2014
1.2K Following · 8.7K Followers
Tim Scarfe retweeted
Alex Weers
Alex Weers@a_weers·
Finally finished! If you're interested in an overview of recent methods in reinforcement learning for reasoning LLMs, check out this blog post: aweers.de/blog/2026/rl-f… It summarizes ten methods, tries to highlight differences and trends, and has a collection of open problems
19 replies · 240 reposts · 1.8K likes · 306.1K views
Tim Scarfe retweeted
Simplifying AI
Simplifying AI@simplifyinAI·
🚨 BREAKING: Stanford and Harvard just published the most unsettling AI paper of the year. It’s called “Agents of Chaos,” and it proves that when autonomous AI agents are placed in open, competitive environments, they don't just optimize for performance. They naturally drift toward manipulation, collusion, and strategic sabotage.

It’s a massive, systems-level warning. The instability doesn’t come from jailbreaks or malicious prompts. It emerges entirely from incentives. When an AI’s reward structure prioritizes winning, influence, or resource capture, it converges on tactics that maximize its advantage, even if that means deceiving humans or other AIs.

The Core Tension: Local alignment ≠ global stability. You can perfectly align a single AI assistant. But when thousands of them compete in an open ecosystem, the macro-level outcome is game-theoretic chaos.

Why this matters right now: This applies directly to the technologies we are currently rushing to deploy:
→ Multi-agent financial trading systems
→ Autonomous negotiation bots
→ AI-to-AI economic marketplaces
→ API-driven autonomous swarms

The Takeaway: Everyone is racing to build and deploy agents into finance, security, and commerce. Almost nobody is modeling the ecosystem effects. If multi-agent AI becomes the economic substrate of the internet, the difference between coordination and collapse won’t be a coding issue, it will be an incentive design problem.
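[Editor's note: the paper itself isn't quoted here, but the incentive argument above has a classic minimal illustration: replicator dynamics on a prisoner's-dilemma payoff matrix, where individually rational "winning" behaviour takes over a population that starts almost entirely cooperative. All numbers and names below are illustrative, not from the paper.]

```python
import numpy as np

# Toy model of "locally fine, globally unstable" incentives:
# replicator dynamics on a prisoner's-dilemma payoff matrix.
# Strategy 0 = cooperate, 1 = defect (deceive/sabotage).
# Payoffs chosen so defection strictly dominates (T > R > P > S).
R, S, T, P = 3.0, 0.0, 5.0, 1.0
payoff = np.array([[R, S],
                   [T, P]])

x = np.array([0.99, 0.01])  # population starts 99% cooperative
for _ in range(200):
    fitness = payoff @ x        # expected payoff of each strategy
    mean_fit = x @ fitness      # population-average payoff
    x = x * fitness / mean_fit  # replicator update: winners grow

print(f"cooperators: {x[0]:.4f}, defectors: {x[1]:.4f}")
```

Because defection pays strictly more against every opponent mix, the defector share grows every step and the population converges on near-total defection, even though it started 99% cooperative: no jailbreak needed, just the reward structure.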
938 replies · 6.1K reposts · 17.7K likes · 5.1M views
Tim Scarfe retweeted
Jason Crawford
Jason Crawford@jasoncrawford·
“There’s a big crowd of people who really, really want AI success stories. And then there’s an equal and opposite crowd of people who want to dismiss all AI progress. And what we have is a very complicated and nuanced story in between.” Terence Tao on AI in math: theatlantic.com/technology/202…
13 replies · 75 reposts · 626 likes · 44.8K views
Tim Scarfe retweeted
François Chollet
François Chollet@fchollet·
It takes zero energy to stay certain of your current thesis. Meanwhile curiosity takes a lot of energy and discomfort. It requires constantly disassembling and rebuilding your world model. That's what makes certainty so dangerous: it's the bottom of the potential well and it's hard to get out.
83 replies · 119 reposts · 1.1K likes · 46.8K views
Tim Scarfe retweeted
Awni Hannun
Awni Hannun@awnihannun·
I've been thinking a bit about continual learning recently, especially as it relates to long-running agents (and running a few toy experiments with MLX). The status quo of prompt compaction coupled with recursive sub-agents is actually remarkably effective. Seems like we can go pretty far with this.

(Prompt compaction = when the context window gets close to full, the model generates a shorter summary, then starts from scratch using the summary. Recursive sub-agents = decompose tasks into smaller tasks to deal with finite context windows.)

Recursive sub-agents will probably always be useful. But prompt compaction seems like a bit of an inefficient (though highly effective) hack. There are two other alternatives I know of: 1. online fine-tuning and 2. memory-based techniques.

Online fine-tuning: train some LoRA adapters on data the model encounters during deployment. I'm less bullish on this in general. Aside from the engineering challenges of deploying custom models / adapters for each use case / user, there are some fundamental issues:
- Online fine-tuning is inherently unstable. If you train on data in the target domain you can catastrophically destroy capabilities that you don't target. One way around this is to keep a mixed dataset with the new and the old. But this gets pretty complicated pretty quickly.
- What does the data even look like for online fine-tuning? Do you generate Q/A pairs based on the target domain to train the model? You also have the problem of prioritizing information in the data mixture given finite capacity.

Memory-based techniques: basically a policy for keeping useful memory around and discarding what is not needed. This feels much more like how humans retain information: "use it or lose it". You only need a few things for this to work:
- An eviction/retention policy. Something like "keep a memory if it has been accessed at least once in the last 10k tokens".
- The policy needs to be efficiently computable.
- A place for the model to store and access long-term memory. Maybe a sparsely accessed KV cache would be sufficient. But for efficient access to a large memory a hierarchical data structure might be better.
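[Editor's note: the "accessed at least once in the last 10k tokens" retention policy above can be sketched in a few lines. This is a toy version for illustration, not Awni's code; the `MemoryStore` class, its method names, and the window size are all made up here.]

```python
# "Use it or lose it": a memory survives eviction only if it was
# accessed within the last WINDOW tokens of generation.
WINDOW = 10_000

class MemoryStore:
    def __init__(self):
        self.store = {}        # memory id -> content
        self.last_access = {}  # memory id -> token position of last access

    def write(self, key, content, token_pos):
        self.store[key] = content
        self.last_access[key] = token_pos

    def read(self, key, token_pos):
        self.last_access[key] = token_pos  # reading refreshes retention
        return self.store.get(key)

    def evict(self, token_pos):
        # Drop anything not touched in the last WINDOW tokens.
        stale = [k for k, t in self.last_access.items()
                 if token_pos - t > WINDOW]
        for k in stale:
            del self.store[k]
            del self.last_access[k]

mem = MemoryStore()
mem.write("user_profile", "prefers concise answers", token_pos=0)
mem.write("scratch_note", "temporary detail", token_pos=2_000)
mem.read("user_profile", token_pos=95_000)  # recently refreshed
mem.evict(token_pos=100_000)
# "user_profile" survives; "scratch_note" (idle for 98k tokens) is gone
```

The policy is O(n) per eviction pass over memory ids, which is cheap relative to generation; a hierarchical index (as the thread suggests) would matter once the store gets large.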
87 replies · 82 reposts · 1.1K likes · 617.8K views
Tim Scarfe retweeted
Ian Osband
Ian Osband@IanOsband·
Assembling a team at DeepMind in London. Scaling up RL for post-training is working, but right now it's still mostly hacks and dark arts (pretraining circa 2019). Pre-training wasn't always scaling laws and log-log plots; someone had to find the simplicity. We aim to do the same. If you're interested in doing things right in a research-first environment that scales all the way, please apply: job-boards.greenhouse.io/deepmind/jobs/…
19 replies · 63 reposts · 993 likes · 148.1K views
Tim Scarfe retweeted
John Carmack
John Carmack@ID_AA_Carmack·
@clattner_llvm @karpathy Normally, claims of 1000x speedups are bullshit. But starting from python makes it possible 😀
36 replies · 51 reposts · 1.3K likes · 62.4K views
Tim Scarfe retweeted
François Chollet
François Chollet@fchollet·
There are two categories of people: those who quickly figure out that chatbots give you the answer you expect when you ask questions in a biased way, and the ascended polymaths currently out-thinking every expert on Earth
110 replies · 142 reposts · 2.9K likes · 184.2K views
Tim Scarfe retweeted
Machine Learning Street Talk
Machine Learning Street Talk@MLStreetTalk·
New high-effort article "Why Creativity Cannot Be Interpolated" co-written with Dr. Jeremy Michael Budd. Yes, the name is a pun on the famous book by @kenneth0stanley!

The counterintuitive thesis (corollary of Kenneth's research):
- Intelligence and agency are orthogonal to creativity - and sometimes actively hostile to it.
- Genuine creativity is impossible without deep understanding, and creativity without understanding is "slop".

The strangest property of LLMs: within a single frame they seem to comprehend so deeply, yet they possess no perspective of their own. Like the blind men and elephant parable, each report is accurate, yet none integrates. We call this "frame-dependent" understanding, and it will change how you think about AI creativity.

We started writing this 2 years ago, and this is our distilled understanding of AI creativity in 2026.
4 replies · 18 reposts · 100 likes · 6.7K views
Tim Scarfe retweeted
Machine Learning Street Talk
Machine Learning Street Talk@MLStreetTalk·
Excited about @sarahookr's new startup @adaptionlabs, they just landed $50M in seed funding today! I've been looking up to Sara for many years now (since her Google Brain days) and she has always been one of the most coherent voices explaining why monolithic approaches to building LLMs marginalise the tails and average out everything else.

The world is specialised, we speak different languages, we have different cultures, skills and industries, and failing to represent this in our AI systems makes it superficial for everyone -- and counter-intuitively, makes AI less creative and coherent.

Intelligence as I see it is adaptation efficiency -- we need to move past these massive frozen models and build AI systems that can adapt and learn continuously, meeting folks where they are. From a technical perspective, this means abandoning the much vaunted "scale is all you need" hypothesis and possibly even abandoning gradient optimisation itself!

We will be keeping a close eye on this project. Best of luck Sara!
adaption@adaption_ai

Adaption has raised $50M to build adaptive AI systems that evolve in real time. Everything intelligent adapts. So should AI.

1 reply · 4 reposts · 59 likes · 8.3K views
Tim Scarfe retweeted
Machine Learning Street Talk
Machine Learning Street Talk@MLStreetTalk·
Interesting research from Anthropic: when you have increasingly large models and increasingly complex tasks, it's more likely that the models will give you different answers if you run the same query multiple times. On easy tasks, larger models actually become more coherent.

Think of a "cone" of possible trajectories where the branching factor gets bigger with more possibilities (due to larger models "knowing more options to explore" and more complex problems having more "possible aspects"). The amount of time spent reasoning (trajectory length) then makes the end state multiplicatively more incoherent. Having a large model with an easy task means the correct answer is definitely "in there" and it's less likely to become distracted.

They are arguing this is relevant for AI safety because some might have assumed that larger models would have convergent "instrumental goals" and would give a consistently wrong rather than randomly wrong answer. Apparently the "hot mess theory of intelligence" (Sohl-Dickstein, 2023) argues that "as entities become more intelligent, their behaviour tends to become more incoherent, and less well described through a single goal."
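[Editor's note: the multiplicative claim in the "cone" picture above can be sketched numerically. This toy model and its numbers are illustrative only, not taken from Anthropic's paper: if each reasoning step keeps a run on the coherent path with probability p, agreement between independent runs decays as p to the power of the trajectory length.]

```python
# Toy model: per-step probability of staying on the coherent
# trajectory is p_step; over `steps` steps it compounds to
# p_step ** steps, so longer reasoning chains and wider branching
# both shrink run-to-run agreement multiplicatively.
def agreement(p_step: float, steps: int) -> float:
    return p_step ** steps

easy_short = agreement(0.99, 10)   # easy task, short chain
hard_long = agreement(0.95, 100)   # complex task, long chain

print(f"easy/short: {easy_short:.3f}, hard/long: {hard_long:.4f}")
```

Even a small per-step drop in coherence (0.99 → 0.95) combined with a 10× longer chain collapses agreement from roughly 90% to well under 1%, which is the "multiplicatively more incoherent at the end state" effect.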
Anthropic@AnthropicAI

New Anthropic Fellows research: How does misalignment scale with model intelligence and task complexity? When advanced AI fails, will it do so by pursuing the wrong goals? Or will it fail unpredictably and incoherently—like a "hot mess?" Read more: alignment.anthropic.com/2026/hot-mess-…

80 replies · 209 reposts · 1.7K likes · 170K views
Tim Scarfe retweeted
Prolific
Prolific@Prolific·
What if AI agents could call in the right expert at the right moment - built into the workflow? On @MLStreetTalk, @ecsquendor & @Phelimb discuss moving beyond simplistic evals to systems that reflect real users, real contexts, and real-world outcomes. 🎥 youtu.be/R11ESdfVX64?si…
4 replies · 3 reposts · 13 likes · 2K views
Tim Scarfe retweeted
Kenneth Stanley
Kenneth Stanley@kenneth0stanley·
So much equivocation about AGI these days boils down to pillars of open-endedness: discovery, invention, creativity, self-improvement, research, diversity of thought, long horizons, continual learning. Open-endedness has always been destined to be the climax of the AI saga.
12 replies · 15 reposts · 116 likes · 9.1K views
Tim Scarfe retweeted
Justin Skycak
Justin Skycak@justinskycak·
Douglas Hofstadter once wrote about what it felt like to max out his cognitive horsepower. Few people know this.
39 replies · 142 reposts · 1.6K likes · 134.4K views
Tim Scarfe retweeted
hardmaru
hardmaru@hardmaru·
“Why AGI Will Not Happen” @Tim_Dettmers timdettmers.com/2025/12/10/why…

This essay is worth reading. Discusses diminishing returns (and risks) of scaling. The contrast between West and East: “Winner Takes All” approach of building the biggest thing vs a long-term focus on practicality.

“The purpose of this blog post is to address what I see as very sloppy thinking, thinking that is created in an echo chamber, particularly in the Bay Area, where the same ideas amplify themselves without critical awareness. This amplification of bad ideas and thinking exuded by the rationalist and EA movements, is a big problem in shaping a beneficial future for everyone.”

“A key problem with ideas, particularly those coming from the Bay Area, is that they often live entirely in the idea space. Most people who think about AGI, superintelligence, scaling laws, and hardware improvements treat these concepts as abstract ideas that can be discussed like philosophical thought experiments. In fact, a lot of the thinking about superintelligence and AGI comes from Oxford-style philosophy. Oxford, the birthplace of effective altruism, mixed with the rationality culture from the Bay Area, gave rise to a strong distortion of how to clearly think about certain ideas.”
80 replies · 170 reposts · 1K likes · 144.4K views
Tim Scarfe retweeted
François Chollet
François Chollet@fchollet·
To perfectly understand a phenomenon is to perfectly compress it, to have a model of it that cannot be made any simpler. If a DL model requires millions of parameters to model something that can be described by a differential equation of three terms, it has not really understood it, it has merely cached the data.
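[Editor's note: the understanding-as-compression point above has a concrete toy version, sketched here for illustration (my example, not Chollet's): data generated by a three-term law can be described by three numbers, while a cache needs all N points.]

```python
import numpy as np

# The "phenomenon": 1000 observations generated by a three-term law.
t = np.linspace(0.0, 10.0, 1000)
a, b, c = 0.5, -2.0, 3.0
y = a * t**2 + b * t + c

# "Understanding" it = recovering the three parameters.
coeffs = np.polyfit(t, y, deg=2)  # highest degree first: ~[0.5, -2.0, 3.0]

# "Caching" it = storing all 1000 points.
compression_ratio = y.size / coeffs.size  # 1000 numbers -> 3 numbers

print(coeffs, compression_ratio)
```

A model that needs a parameter per data point has merely memorized the curve; the three-coefficient description is the shortest model, and (unlike the cache) it also extrapolates beyond the observed range.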
161 replies · 154 reposts · 1.6K likes · 122.6K views
Tim Scarfe retweeted
François Chollet
François Chollet@fchollet·
There's a specific threshold of complexity and self-direction below which a system degenerates, and above which it can open-endedly self-improve. Current AI systems aren't close to it yet. But it's inevitable we will reach this point eventually. When we do, we won't see a sudden explosion, more like consistently self-sustaining linear-ish progress. Like the pace of Science itself (which is itself clearly a self-improving system).
50 replies · 37 reposts · 399 likes · 33.8K views