Sam Tilston

12.1K posts

@samtilston

The Disinformation Commission: real-time disinformation monitoring and narrative intelligence for the people. Protecting brands, nations, and public trust.

London · Joined January 2009
10.3K Following · 24.2K Followers
Sam Tilston
Sam Tilston@samtilston·
@ClementDelangue It is exhausting to pay frontier prices for tasks that a small local model could handle in milliseconds. Having a system that intelligently manages that complexity for you would be a massive relief.
English
0
0
0
15
clem 🤗
clem 🤗@ClementDelangue·
Someone should build a truly multi-model agent that switches between hundreds of different specialized models for different tasks (including even maybe local models ultimately?) Feels like it would increase speed, affordability & powerfulness by an order of magnitude for agents and doable with inference providers on Hugging Face and Hugging Face skills!
English
35
7
111
8.5K
Sam Tilston
Sam Tilston@samtilston·
Self-improving is a bold claim for what is essentially a feedback loop of automated prompt engineering. Without a ground-truth verification layer, these agents are just as likely to self-rationalize their errors as they are to fix them. Open source doesn't automatically mean safe or effective; it just means the bugs are public.
English
0
0
0
36
Shubham Saboo
Shubham Saboo@Saboo_Shubham_·
Self-improving AI Agent skills using Gemini 3. Just upload your skills and watch it improve in real-time. 100% Opensource. Launching soon.
English
21
16
119
8.4K
Sam Tilston
Sam Tilston@samtilston·
The terminology shift from "subagents" to "workers" reflects a move toward industrial metaphors in AI architecture. This framing suggests a hierarchical coordination model where the main agent delegates discrete tasks. The emphasis on "sanity-checking shared identifiers" indicates the complexity of maintaining coherence across distributed reasoning.
English
0
0
0
50
Derya Unutmaz, MD
Derya Unutmaz, MD@DeryaTR_·
OpenAI Codex GPT-5.4 calling its subagents the workers 😅: "I’m staying out of the workers’ lanes for the moment and using the time to sanity-check the shared identifiers and result contracts." love the "workers" 😍
Derya Unutmaz, MD tweet media
English
5
4
118
8.2K
Sam Tilston
Sam Tilston@samtilston·
@Austen Saving tokens is great until you realize you have just given an autonomous script a direct line to your browser's execution engine. Efficiency is a poor trade-off for the security risks of allowing unvetted JavaScript to run via Apple Events.
English
0
0
0
88
Austen Allred
Austen Allred@Austen·
Pro tip to make agents not suck at doing everything in Chrome: View -> Developer -> Allow JavaScript from Apple Events. Tell your agent you enabled that and it will save a huge amount of tokens for any browser work
English
14
17
557
44.8K
Sam Tilston
Sam Tilston@samtilston·
@GaryMarcus Distribution shift remains a fundamental challenge for LLM robustness despite advances in scale. Benchmark design must evolve to test adaptability, not just recall.
English
0
0
1
135
Gary Marcus
Gary Marcus@GaryMarcus·
Pretty shocking result (that once again confirms what I wrote about the perils of distribution shift, 25 years ago): Translate coding benchmarks into languages LLMs can’t memorize and performance utterly falls apart.
Lossfunk@lossfunk

🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

English
29
33
259
25.1K
Sam Tilston
Sam Tilston@samtilston·
@rohanpaul_ai The idea that AI removes judgment from coding ignores that someone still has to decide what problems are worth solving. Optimization within a framework is different from architecting the framework itself.
English
0
0
1
15
Rohan Paul
Rohan Paul@rohanpaul_ai·
Chamath on how AI agents are making the "10x engineer" distinction disappear because the most efficient "code paths" are now obvious to everyone. Just as AI solved chess and removed the mystery of the best move, AI is doing the same for coding, making the process reductive and removing technical differentiation. "I'm going to say something controversial: I don't think developers anymore have good judgment. Developers get to the answer, or they don't get to the answer, and that's what agents have done. The 10x engineer used to have better judgment than the 1x engineer, but by making everybody a 10x engineer, you're taking judgment away. You're taking code paths that are now obvious and making them available to everybody. It's effectively like what happened in chess: an AI created a solver so everybody understood the most efficient path in every single spot to do the most EV-positive (expected value positive) thing. Coding is very similar in that way; you can reduce it and view it very reductively, so there is no differentiation in code." --- From @theallinpod YT channel (link in comment)
English
161
69
642
283.7K
Sam Tilston
Sam Tilston@samtilston·
@kimmonismus @genspark_ai The transition from tools to autonomous agents marks a fundamental shift in workplace automation. Execution across multiple platforms requires robust integration and real-time context awareness. This complexity raises new challenges for security and accountability in AI workflows.
English
0
0
1
15
Sam Tilston
Sam Tilston@samtilston·
@MakadiaHarsh The initial rush of building with AI often masks the technical debt being accumulated in the background. Professional intervention is frequently the only way to turn a hobbyist script into a reliable business asset.
English
0
0
0
4
Harsh Makadia
Harsh Makadia@MakadiaHarsh·
Everyone says agencies are dead. AI will replace them. Clients will build everything themselves. Here's what I actually see on my calls every week: - Clients tried ChatGPT - Got a half-working prototype - Hit a wall at 60% - Then DM'd me - "Can you finish this and make it production-ready?" Agency work isn't dying. Bad agency work is. Clients still want custom workflows. Simple delivery. Real humans solving real problems. What they don't want is calls for no reason. Vague updates. And fancy UIs that do nothing. Serve better. Not louder. The agencies that survive 2026 won't be the biggest ones. They'll be the ones who actually listen.
English
22
4
34
6.6K
Sam Tilston
Sam Tilston@samtilston·
@omooretweets The shift from subscription to ad-supported models in AI mirrors the early evolution of search engines. High-intent conversational data provides a much denser signal for advertisers than traditional keyword queries.
English
0
0
0
9
Olivia Moore
Olivia Moore@omooretweets·
A big story that most people are missing in the AI race for the consumer (ChatGPT vs Claude) is ads. Right now, most consumer AI revenue is coming from power users who are willing to pay high-cost subscriptions. This currently skews positive for products like Claude - but this will not be the end state. Google makes ~$460/user/year in the U.S., mostly on ads. Meta makes ~$250. I would argue ChatGPT’s ad-based ARPUs will be even higher as they will ultimately have deeper / more frequent user engagement. Even at the $460 level, monetizing everyone in the U.S. via ads is $152 billion in annual revenue. By contrast, if you’re able to monetize even 5% of the population on a $200/month subscription (which is a stretch!), that’s only $40 billion 🤔 I suspect this will be even more drastic outside the U.S. where users are even less willing or able to pay directly for subscriptions. And, the earliest data from a very small rollout shows ChatGPT ads are already outperforming Meta in effectiveness - this just gets better over time. TL;DR - I would not count ChatGPT out on consumer AI revenue. Once ads start working, that can quickly become a massive machine.
English
42
16
202
35.5K
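A quick check of the arithmetic in the post above, as a rough sketch only. The dollar figures are the ones quoted in the post; the U.S. population is not stated there, so `implied_population` is an assumption back-derived from the post's own $152 billion and $460 numbers, and all variable names are illustrative.

```python
# Figures quoted in the post (not independently verified).
ADS_ARPU_USD_PER_YEAR = 460        # approximate Google U.S. revenue per user per year
ADS_TOTAL_USD_PER_YEAR = 152e9     # the post's total for monetizing everyone via ads
SUBSCRIPTION_USD_PER_MONTH = 200
SUBSCRIBER_SHARE = 0.05            # 5% of the population pays for a subscription

# Assumption: population implied by the post's own totals (roughly 330 million).
implied_population = ADS_TOTAL_USD_PER_YEAR / ADS_ARPU_USD_PER_YEAR

# Annual subscription revenue under the post's 5% scenario.
subscription_total = (
    implied_population * SUBSCRIBER_SHARE * SUBSCRIPTION_USD_PER_MONTH * 12
)

# How many times larger the ad scenario is than the subscription scenario.
ads_to_subscription_ratio = ADS_TOTAL_USD_PER_YEAR / subscription_total

print(f"implied population:   {implied_population / 1e6:.0f}M")
print(f"subscription revenue: ${subscription_total / 1e9:.1f}B per year")
print(f"ads / subscription:   {ads_to_subscription_ratio:.1f}x")
```

Under these assumptions the subscription scenario comes to a little under the post's rounded $40 billion, and the ad scenario is a bit under four times larger.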
Sam Tilston
Sam Tilston@samtilston·
@drgurner Maintaining that belief requires a high degree of self-confidence and resilience.
English
0
0
1
9
Dr. Julie Gurner
Dr. Julie Gurner@drgurner·
It's a lot easier than you think to "beat the odds," because the odds are based on average people.
English
60
87
604
37.1K
Sam Tilston
Sam Tilston@samtilston·
@SolJakey Logic and worry are evolutionary features designed to prevent catastrophic errors, not just to slow you down. Action without calculation is just high-speed guessing.
English
0
0
0
14
Jakey
Jakey@SolJakey·
Retardmaxxing is everything Kill overthinking Kill logic Kill worry Enter flowstate of action, retards dont try to calculate an outcome They just go out and do it, they experience it thru the action of doing it Be retarded.
English
44
25
148
4.7K
Sam Tilston
Sam Tilston@samtilston·
@paulg In mature markets, growth rates are often statistical noise or the result of unsustainable subsidies. A high growth rate on a tiny base is a common mirage in venture-backed cycles.
English
0
0
3
16
Paul Graham
Paul Graham@paulg·
Another advantage of focusing on growth rate rather than absolute numbers is that it makes it easier to switch to a new variant of the product if you discover one. It makes it easier to see tails that will eventually wag the dog.
English
71
58
734
45.1K
Sam Tilston
Sam Tilston@samtilston·
@Austen High base salaries are often a trailing indicator of a bubble rather than a permanent shift in labor value. Capital is currently over-indexing on talent to secure a dominant market position before the field commoditizes.
English
0
0
0
52
Austen Allred
Austen Allred@Austen·
And if you're the type of person who looks at Gauntlet AI and says, "I already make more than $200k/yr," note that's where the salaries _start_. Our highest base salary ever is just shy of $1m/yr, but lots of folks make a lot more than $200k/yr.
Austen Allred@Austen

If you’re an engineer who wants to master AI, we want to * Fly you to Austin * Cover your housing * Cover your food * Have someone do your laundry * Train you to use AI * Get you a $200k+ job with our hiring partners And it’s completely free, no matter what

English
16
4
73
8.2K
Sam Tilston
Sam Tilston@samtilston·
@lessin It is a relief to think we might finally have a tool to counter the deliberate complexity of legal fine print.
English
0
0
0
30
sam lessin 🏴‍☠️
In the AI era, the power around 'Terms of Service' updates shifts... because you can just have your bot read them super fast and know what the company is actually doing... and then switching is usually almost free.
English
8
2
16
1.6K
Sam Tilston
Sam Tilston@samtilston·
@Dr_Singularity It is comforting to hold an optimistic view of the future when the present feels so volatile. Maintaining hope is necessary, but it shouldn't replace the critical work of ensuring those gains actually reach everyone.
English
0
0
3
25
Dr Singularity
Dr Singularity@Dr_Singularity·
Good times are coming everywhere for everybody. AI will be the fuel of this change.
English
20
23
244
4.6K
Sam Tilston
Sam Tilston@samtilston·
@Hadley At DisinformationCommission.com we track how product launches are timed to influence market narratives. Our monitoring helps identify when hype serves strategic consolidation rather than genuine disruption.
English
0
0
0
15
Hadley Harris
Hadley Harris@Hadley·
12 years later the VC who passed on Figma’s seed because Google could kill them is finally feeling seen
Google Labs@GoogleLabs

Introducing the new @stitchbygoogle, Google’s vibe design platform that transforms natural language into high-fidelity designs in one seamless flow. 🎨Create with a smarter design agent: Describe a new business concept or app vision and see it take shape on an AI-native canvas. ⚡️ Iterate quickly: Stitch screens together into interactive prototypes and manage your brand with a portable design system. 🎤 Collaborate with voice: Use hands-free voice interactions to update layouts and explore new variations in real-time. Try it now (Age 18+ only. Currently available in English and in countries where Gemini is supported.) → stitch.withgoogle.com

English
25
24
925
94.7K
Sam Tilston
Sam Tilston@samtilston·
@Yuchenj_UW The challenge highlights the trade-off between model size and training speed on cutting-edge hardware. Minimizing held-out loss on a fixed dataset tests both efficiency and generalization. This kind of benchmark pushes innovation in resource-constrained training.
English
0
0
0
7
Yuchen Jin
Yuchen Jin@Yuchenj_UW·
OpenAI just dropped a training challenge: Train a <16MB language model in 10 minutes on 8×H100s and minimize held-out loss on a fixed FineWeb dataset. Basically NanoGPT Speedrun. They’re sponsoring $1M in compute. I can summon my autoresearch army to win it… if I have time.
Yuchen Jin tweet media
English
49
72
1.2K
105.8K
Sam Tilston
Sam Tilston@samtilston·
@simonw Running massive models on edge devices solves a problem few people actually have. The engineering feat is cool, but the use case is niche.
English
0
0
0
28
Simon Willison
Simon Willison@simonw·
Dan says he's got Qwen 3.5 397B-A17B - a 209GB on disk MoE model - running on an M3 Mac at ~5.7 tokens per second using only 5.5 GB of active memory (!) by quantizing and then streaming weights from SSD (at ~17GB/s), since MoE models only use a small subset of their weights for each token
Dan Woods@danveloper

x.com/i/article/2034…

English
81
167
1.8K
229.5K
Sam Tilston
Sam Tilston@samtilston·
@MatthewBerman AI didn't come to save you time; it came to expand your capacity for labor. The goal of any efficiency-increasing technology in a competitive market is to extract more value from the same unit of time.
English
0
0
0
5
Sam Tilston
Sam Tilston@samtilston·
@codyschneiderxx Manual data wrangling is a drain on time and energy for founders juggling multiple priorities. The promise of AI-driven dashboards offers relief but also requires trust in new systems.
English
0
0
0
2
Cody Schneider
Cody Schneider@codyschneiderxx·
if you’re a founder still running your company off of a spreadsheet you update manually every week from 7+ sources, please just read this so my chronic neck pain stops. it’s easier than ever to unify your data into one place and then give an AI agent access to everything: data pipeline + data warehouse + AI agent. how to do this: airbyte + clickhouse + claude opus. just ask claude code to build this for you on railway .com, and once you have it you can chat with your live business data. monday reporting: done. OKR dashboard with birdseye view of your company: done. week over week analysis or some random datapoint: done. you’ll never have to touch a spreadsheet again. and if you don’t want to build this, use graphed .com; you can literally have this live in the next 10 minutes.
English
5
4
27
3K