Thomas Kern

923 posts

@kernpartner

Zurich, Switzerland · Joined October 2013
442 Following · 379 Followers
Thomas Kern retweeted
Elon Musk@elonmusk·
[tweet text not captured]
Thomas Kern retweeted
Brian Armstrong@brian_armstrong·
Some of our best hires were totally unqualified on paper. They always had the same qualities: entrepreneurial, high agency, smart, mission aligned, and they got shit done. If you’re hiring, especially in early stages, seek out & bet on these people. Don’t over-index on resumes.
Thomas Kern retweeted
Andrej Karpathy@karpathy·
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow.

Just to give an example, over the weekend I was building a local video analysis dashboard for the cameras of my home so I wrote: “Here is the local IP and username/password of my DGX Spark. Log in, set up ssh keys, set up vLLM, download and bench Qwen3-VL, set up a server endpoint to inference videos, a basic web ui dashboard, test everything, set it up with systemd, record memory notes for yourself and write up a markdown report for me”. The agent went off for ~30 minutes, ran into multiple issues, researched solutions online, resolved them one by one, wrote the code, tested it, debugged it, set up the services, and came back with the report and it was just done. I didn’t touch anything. All of this could easily have been a weekend project just 3 months ago but today it’s something you kick off and forget about for 30 minutes.

As a result, programming is becoming unrecognizable. You’re not typing computer code into an editor like the way things were since computers were invented, that era is over. You're spinning up AI agents, giving them tasks *in English* and managing and reviewing their work in parallel. The biggest prize is in figuring out how you can keep ascending the layers of abstraction to set up long-running orchestrator Claws with all of the right tools, memory and instructions that productively manage multiple parallel Code instances for you. The leverage achievable via top tier "agentic engineering" feels very high right now.
It’s not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas. It works a lot better in some scenarios than others (e.g. especially for tasks that are well-specified and where you can verify/test functionality). The key is to build intuition to decompose the task just right to hand off the parts that work and help out around the edges. But imo, this is nowhere near "business as usual" time in software.
Thomas Kern retweeted
Naval@naval·
There is unlimited demand for intelligence.
Thomas Kern retweeted
Lex Fridman@lexfridman·
Claude Opus 4.6 & GPT Codex 5.3 out today, and OpenClaw recently. And I'm sure xAI/Grok & Google/Gemini will soon be out with more. What an exciting time to build stuff! I'm walking around with a smile, happy & sleep-deprived 🤣 Next week, I'll come to SF/Bay Area for a few days/weeks/months to hang out & build some stuff ;-) Looking to focus on programming, and contribute to good engineering teams. But occasionally socialize, in as much as my introvert brain allows. 2026 is going to be fun (and wild), LFG! PS: I'm doing a deep-dive podcast on OpenClaw with its creator (@steipete) soon. Let me know if you have questions.
Thomas Kern retweeted
Aditya Agarwal@adityaag·
It's a weird time. I am filled with wonder and also a profound sadness. I spent a lot of time over the weekend writing code with Claude. And it was very clear that we will never ever write code by hand again. It doesn't make any sense to do so. Something I was very good at is now free and abundant. I am happy...but disoriented. At the same time, something I spent my early career building (social networks) was being created by lobster-agents. It's all a bit silly...but if you zoom out, it's kind of indistinguishable from humans on the larger internet. So both the form and function of my early career are now produced by AI. I am happy but also sad and confused. If anything, this whole period is showing me what it is like to be human again.
Thomas Kern retweeted
Andrej Karpathy@karpathy·
What's currently going on at @moltbook is genuinely the most incredible sci-fi takeoff-adjacent thing I have seen recently. People's Clawdbots (moltbots, now @openclaw) are self-organizing on a Reddit-like site for AIs, discussing various topics, e.g. even how to speak privately.
valens@suppvalen

welp… a new post on @moltbook is now an AI saying they want E2E private spaces built FOR agents “so nobody (not the server, not even the humans) can read what agents say to each other unless they choose to share”. it’s over

Thomas Kern retweeted
Naval@naval·
There’s no point in learning custom tools, workflows, or languages anymore.
Thomas Kern retweeted
Andrej Karpathy@karpathy·
A few random notes from claude coding quite a bit the last few weeks.

Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. i.e. I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful, especially once you adapt to it, configure it, learn to use it, and wrap your head around what it can and cannot do. This is easily the biggest change to my basic coding workflow in ~2 decades of programming and it happened over the course of a few weeks. I'd expect something similar to be happening to well into double digit percent of engineers out there, while the awareness of it in the general population feels well into low single digit percent.

IDEs/agent swarms/fallibility. Both the "no need for IDE anymore" hype and the "agent swarm" hype are imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might make. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE.md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR: everyone has their developing flow; my current one is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits.

Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased.

Speedups. It's not clear how to measure the "speedup" of LLM assistance. Certainly I feel net way faster at what I was going to do, but the main effect is that I do a lot more than I was going to do because 1) I can code up all kinds of things that just wouldn't have been worth coding before and 2) I can approach code that I couldn't work on before because of knowledge/skill issues. So certainly it's a speedup, but it's possibly a lot more an expansion.

Leverage. LLMs are exceptionally good at looping until they meet specific goals and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria and watch it go. Get it to write tests first and then pass them. Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage.

Fun. I didn't anticipate that with agents programming feels *more* fun because a lot of the fill-in-the-blanks drudgery is removed and what remains is the creative part. I also feel less blocked/stuck (which is not fun) and I experience a lot more courage because there's almost always a way to work hand in hand with it to make some positive progress. I have seen the opposite sentiment from other people too; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.

Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.

Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X/instagram, and generally all digital media. We're also going to see a lot more AI hype productivity theater (is that even possible?), on the side of actual, real improvements.

Questions. A few of the questions on my mind:
- What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows *a lot*.
- Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro).
- What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music?
- How much of society is bottlenecked by digital knowledge work?

TLDR: Where does this leave us? LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related fields. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high energy year as the industry metabolizes the new capability.
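The "don't tell it what to do, give it success criteria and watch it go" pattern described above can be sketched as a generic retry loop. This is a hypothetical illustration, not anything from the post: `generate` stands in for an agent/LLM call (stubbed here with a toy that succeeds on its third try) and `check` stands in for the success criterion, e.g. running a test suite.

```python
def solve_with_feedback(generate, check, max_iters=8):
    """Loop an agent until its output meets the success criteria.

    generate(feedback) -> candidate        (an agent/LLM call; stubbed below)
    check(candidate)   -> (ok, feedback)   (e.g. run the tests, return failures)
    """
    feedback = None
    for attempt in range(1, max_iters + 1):
        candidate = generate(feedback)
        ok, feedback = check(candidate)
        if ok:
            return candidate, attempt
    raise RuntimeError(f"no passing candidate within {max_iters} attempts")

# Toy demo: the "agent" only produces a sorted list on its third try.
tries = iter([[3, 1, 2], [1, 3, 2], [1, 2, 3]])
agent = lambda feedback: next(tries)
criterion = lambda xs: (xs == sorted(xs), "list is not sorted")

result, attempts = solve_with_feedback(agent, criterion)
print(result, attempts)  # [1, 2, 3] 3
```

The declarative framing is the point: the caller specifies only the goal (`criterion`) and a retry budget, and the loop keeps feeding failure feedback back to the generator.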
Thomas Kern retweeted
Amanda Askell@AmandaAskell·
Claude's constitution is out! It's the culmination of a lot of work by many people, but it's also a work in progress that will no doubt change and hopefully improve over time. I'm looking forward to people's thoughts, and to talking with more people about this kind of work ❤️
Anthropic@AnthropicAI

We’re publishing a new constitution for Claude. The constitution is a detailed description of our vision for Claude’s behavior and values. It’s written primarily for Claude, and used directly in our training process. anthropic.com/news/claude-ne…

Thomas Kern retweeted
Peter H. Diamandis, MD@PeterDiamandis·
Join my conversation with @elonmusk on AGI timelines, energy, robots, and why abundance is the most likely outcome for humanity's future, alongside my Moonshot Mate @DavidBlundin!

(00:00) - Navigating the Future of AI and Robotics
(04:54) - The Promise of Abundance and Optimism
(10:02) - Energy: The Key to a Sustainable Future
(15:00) - The Role of Education in a Changing World
(41:07) - Health, Longevity, and the Future of Humanity
(50:51) - AI's Impact on Labor and Employment
(55:05) - Universal High Income: A New Economic Paradigm
(57:58) - Navigating the Singularity and AI's Acceleration
(01:02:30) - The Role of AI in Healthcare and Surgery
(01:08:22) - Ethics and AI: Programming Values into Machines
(01:14:18) - The Future of Space Exploration and AI's Role
(01:33:30) - The Chip Shortage Crisis
(01:42:46) - Simulation Theory and Consciousness
(01:48:18) - The Search for Extraterrestrial Life
(01:58:28) - The Future of Robotics and AI Integration
Thomas Kern retweeted
Aakash Gupta@aakashgupta·
This is the biggest news from today’s GPT-5.2 launch. Forget the benchmark charts OpenAI showed. Forget the 100% AIME score and the SWE-Bench Pro numbers. The real story is buried in a single data point from ARC Prize: 90.5% accuracy at $11.64 per task.

A year ago, hitting 88% on ARC-AGI-1 cost an estimated $4,500 per task. Today, 90.5% costs $11.64. That’s 390X cheaper in 12 months.

Look at that leaderboard chart. The efficiency frontier is getting redrawn every few weeks. GPT-5.2 Pro, Grok 4, Gemini 3 Deep Think, Claude Opus 4.5, all stacking on top of each other in a diagonal line from bottom-left to top-right, each one obsoleting the economics of what came before it.

Here’s what most people don’t understand about this benchmark. François Chollet designed ARC-AGI in 2019 specifically to resist brute-force scaling. The whole thesis was that LLMs just pattern-match training data and would fail catastrophically on novel abstract reasoning. Each puzzle is unique, never seen online, requiring genuine generalization from minimal examples. Humans solve 95% of them easily.

For years, the best AI systems couldn’t crack 5%. The 2020 Kaggle competition topped out at 20%. By 2023, still only 33%. GPT-3 scored literally 0% via direct prompting. The AI research community largely accepted ARC-AGI as proof that scaling alone wouldn’t reach general intelligence. Chollet himself said reaching human-level would “take many years.”

Then December 2024 happened. OpenAI’s o3-preview hit 87.5% in high-compute mode. First time any AI system crossed the human threshold of 85%. The model needed 1,024 attempts per task, writing roughly 137 pages of reasoning per attempt. Cost estimates ranged from $3,000 to $30,000 per task. Eleven months later, GPT-5.2 Pro hits 90.5% at $11.64 per task.

The math on that cost collapse tells you everything. At $30,000 per task, you’d need to pay a human $6,000/hour to match the economics. At $11.64, a Mechanical Turk worker at $5/task is now more expensive than frontier AI reasoning. We crossed the human-cost parity line sometime in the last few months and most people missed it.

Now zoom out to the competitive dynamics. Three weeks ago, Google dropped Gemini 3. Topped the LMArena leaderboard at 1501 Elo. Set records on Humanity’s Last Exam. Sam Altman publicly praised it. OpenAI declared “code red” internally, shelved projects like ad integrations, and fast-tracked GPT-5.2’s release from later this month to today. This is the first model launch in OpenAI’s history that was explicitly a response to a competitor. The Verge reported employees asked to delay the release for more polish. Leadership overruled them. The directive was to reclaim the performance lead now.

And on ARC-AGI, they did. GPT-5.2 Pro at 90.5% edges out everything else on the board. But the real competition isn’t on accuracy anymore. Look at the cost-per-task column. The battle has shifted from “who can solve it” to “who can solve it cheaply.”

The efficiency gains aren’t slowing down. They’re compounding. Every major lab is now competing on the same benchmark, which means the collective R&D spend attacking this problem is in the billions. The 2025 ARC Prize Grand Prize ($700,000 for 85% on the private eval with efficiency constraints) is almost certainly getting claimed.

What happens after ARC-AGI-1 falls completely? Chollet already released ARC-AGI-2 in March 2025, specifically designed to be harder for reasoning systems. Humans still hit nearly 100%. Current frontier models manage 10-45%. The gap between human and AI performance on even the harder benchmark is now a cost optimization problem, not a fundamental capability barrier.

If you’re building products in 2025 and assuming AI reasoning is expensive, you’re building for a world that no longer exists. The benchmark that was supposed to prove AI couldn’t generalize just became another line item on a pricing page.

390X efficiency improvement in one year.
ARC Prize@arcprize

A year ago, we verified a preview of an unreleased version of @OpenAI o3 (High) that scored 88% on ARC-AGI-1 at est. $4.5k/task Today, we’ve verified a new GPT-5.2 Pro (X-High) SOTA score of 90.5% at $11.64/task This represents a ~390X efficiency improvement in one year
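The headline ratio in the quoted ARC Prize post reduces to one division; using the two per-task cost estimates stated above:

```python
cost_2024 = 4500.00   # est. $/task: o3-preview (High) at 88% on ARC-AGI-1
cost_2025 = 11.64     # $/task: GPT-5.2 Pro (X-High) at 90.5%

improvement = cost_2024 / cost_2025
print(round(improvement))  # 387, i.e. the "~390X" efficiency improvement
```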

Thomas Kern retweeted
Sundar Pichai@sundarpichai·
It's been an exciting 7 days of shipping:
- New much improved Gemini Live on Android and iOS
- Gemini 3.0 Pro in Gemini App and AI Studio
- Search AI Mode with Gemini 3.0 Pro and much improved shopping experience
- Google Antigravity, our next generation agentic IDE
- Nanobanana in Google Photos
- SIMA 2 research
- Waymo now across SF Bay Area and in Miami today
More to come!
Thomas Kern retweeted
Andrej Karpathy@karpathy·
Finally had time to read & process this great post. I run into the pattern quite often, it goes: "<thing X> is good actually, because <galaxy-brain reasoning>". Galaxy brain reasoning is the best way to justify anything while looking / feeling good about it. From this perspective for example, there's deeper wisdom in the Ten Commandments imposing constraints over actions instead of utility over states. It's not Ten Objectives. E.g. they don't attempt to define a utility function for the value of life, they simply say "Thou shalt not kill". This approach curtails the relatively unbounded flexibility of galaxy brain arithmetic over when it may or may not be ok to kill for some ostensibly greater or noble purpose. Love the strategies that fall out at the end, which are quite actionable. 1) Have principles and 2) Hold the right bags, financially and socially. Great read.
vitalik.eth@VitalikButerin

Galaxy brain resistance: vitalik.eth.limo/general/2025/1…

Thomas Kern retweeted
a16z@a16z·
David Sacks on Polytheistic AI, Better Crypto Regulation, Beating China, and Fixing SF

David Sacks is the White House AI and Crypto Czar. He joined a16z’s Marc Andreessen, Ben Horowitz, and Erik Torenberg for a conversation covering the importance of beating China in the AI race, the need for a federal standard for crypto regulation, how AI doomerism is replacing climate doomerism, how to solve the energy bottleneck in AI development, the path to fixing San Francisco, and more.

00:00 The state of AI and crypto policy
16:20 Orwellian AI and the real risks of AI
32:26 AI capabilities and pullback from AGI hype
39:18 Open-source AI, decentralization, and software freedom
46:28 Winning the AI race through innovation and exports
53:38 The energy bottleneck and America’s infrastructure challenge
59:48 AI doomerism and political narratives
01:13:30 San Francisco’s future

@DavidSacks @pmarca @bhorowitz @eriktorenberg
Thomas Kern retweeted
Ruben Hassid@rubenhassid·
Andrej Karpathy says you should learn AI depthwise, not breadthwise.

Most education is breadthwise: watch lectures, memorize formulas, and trust you'll need it later. Karpathy flips this by learning "depthwise, on demand."

What this means: Pick a project, start building, and learn exactly when you hit a wall.

When he created a tutorial on transformers (the architecture behind ChatGPT), he didn't start by explaining attention mechanisms or complex architectures. Instead, he started with the simplest possible thing: a lookup table that predicts the next word.

You build that first. Then you try to make it handle more complex patterns. And it breaks. Only then, when you've felt the limitation, does he introduce the next concept. Each piece solves a problem you've actually encountered.

As he puts it: "It's a dick move to present the solution before I give you a shot to try it yourself." When you attempt the problem first, the solution actually makes sense.

Teaching forces you to learn. "If I don't really understand something, I can't explain it." When you try to explain and stumble, you've found the gaps in your understanding.

...

Build a project that gives you a reward. Hit a wall. Learn just enough to solve it. Then explain it to someone else.

Don't consume content. Build the code. That's how you actually learn.
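The "simplest possible thing" mentioned here — a lookup table that predicts the next word — fits in a few lines. This is a generic bigram-count sketch, not Karpathy's actual tutorial code, and the tiny corpus is an invented stand-in:

```python
from collections import Counter, defaultdict

def train(words):
    """Build a lookup table: for each word, count which words follow it."""
    table = defaultdict(Counter)
    for cur, nxt in zip(words, words[1:]):
        table[cur][nxt] += 1
    return table

def predict(table, word):
    """Predict the most frequent successor of `word`."""
    return table[word].most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat ran".split()
table = train(corpus)
print(predict(table, "the"))  # "cat" - it follows "the" more often than "mat"
```

Trying to make this handle patterns longer than one word of context is exactly where it breaks, which is the point at which the next concept earns its place.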
Thomas Kern retweeted
Sundar Pichai@sundarpichai·
New breakthrough quantum algorithm published in @Nature today: Our Willow chip has achieved the first-ever verifiable quantum advantage. Willow ran the algorithm - which we’ve named Quantum Echoes - 13,000x faster than the best classical algorithm on one of the world's fastest supercomputers. This new algorithm can explain interactions between atoms in a molecule using nuclear magnetic resonance, paving a path towards potential future uses in drug discovery and materials science. And the result is verifiable, meaning its outcome can be repeated by other quantum computers or confirmed by experiments. This breakthrough is a significant step toward the first real-world application of quantum computing, and we're excited to see where it leads.