Hafsteinn
1.3K posts

Hafsteinn
@hafsteinn
Associate Professor in CS at the University of Iceland, research scientist at deCODE (views here are my own) 🇮🇸🏳️🌈, he/him.

We raised $500M at an $11B valuation to transform how people interact with technology.

I truly love the singularity

You really don’t realize how much of your behavior is driven by serotonin until you take cyproheptadine

Sora has an apt Japanese meaning, given the context

Should I get a gene therapy done to become hyper-muscular with zero effort?

After thinking about this problem for months, I am so happy to finally introduce DetailBench! It answers a simple question: how good are current LLMs at finding small errors when they are *not* explicitly asked to do so? (Yes, the graph is right!)

I’m a psychiatrist. In 2025, I’ve seen 12 people hospitalized after losing touch with reality because of AI. Online, I’m seeing the same pattern. Here’s what “AI psychosis” looks like, and why it’s spreading fast: 🧵

I'm noticing that due to (I think?) a lot of benchmarkmaxxing on long-horizon tasks, LLMs are becoming a little too agentic by default, a little beyond my average use case. For example in coding, the models now tend to reason for a fairly long time, they have an inclination to start listing and grepping files all across the entire repo, they do repeated web searches, they over-analyze and over-think little rare edge cases even in code that is knowingly incomplete and under active development, and often come back ~minutes later even for simple queries. This might make sense for long-running tasks, but it's a poor fit for the more "in the loop" iterated development that I still do a lot of, or if I'm just looking for a quick spot check before running a script, just in case I got some indexing wrong or made some dumb error. So I find myself quite often stopping the LLMs with variations of "Stop, you're way overthinking this. Look at only this single file. Do not use any tools. Do not over-engineer", etc. Basically, as the default slowly creeps into the "ultrathink" super-agentic mode, I feel a need for the reverse, and more generally for good ways to indicate or communicate intent / stakes, from "just have a quick look" all the way to "go off for 30 minutes, come back when absolutely certain".
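The "communicate intent / stakes" idea could be as simple as a reusable prompt preamble keyed by effort level. A minimal sketch (the effort labels, wording, and helper name here are hypothetical, my own invention, not any model's or tool's actual API):

```python
# Hypothetical "scope preamble" helper for steering agentic LLMs.
# The effort tiers and their wording are assumptions for illustration only.

PREAMBLES = {
    "quick": (
        "Quick spot check only. Look at just the file I show you. "
        "Do not use any tools, do not grep the repo, do not over-engineer."
    ),
    "normal": "Answer directly; use tools only if clearly necessary.",
    "deep": (
        "Take your time. Explore the repo, search the web if needed, and "
        "come back only when you are absolutely certain."
    ),
}

def scoped_prompt(query: str, effort: str = "quick") -> str:
    """Prefix a query with an explicit statement of intent and stakes."""
    return f"{PREAMBLES[effort]}\n\n{query}"

print(scoped_prompt("Did I get the indexing right in this loop?"))
```

The same preamble text could just as well live in a per-project instructions file; the point is only that the stakes are stated up front instead of being corrected after the model has wandered off for minutes.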

In Claude Code, the difference between using planning mode and actually spending 15 minutes planning with the agent in a plan.md file is the difference between a task that runs for 15 minutes and a task that can run for about an hour without intervention.