
Lon()
@Lon
Absurdist intern. Exquisite shitpoasting. High-school dropout + teenage dad. Failed angel investor. EP on Gary Busey film. SIGMOD winner. Shipped infra you use.




Nobody cares or needs to hear this from me, but I’m just registering my opinion that: 1) LLMs are a totally ordinary technology. But so were cars. Ordinary technologies can have big impacts. 2) They are *very obviously* not reasoning and the way that smart people specifically trick themselves on this point is critical to understanding many things about the world.










GitHub needs a reputation system. And a way to exclude people who have massive activity upticks around last November. If you are a new contributor and your first PR looks at all AI-generated, it will just be closed. Not spending human time reading it.






Anthropic tried to kill 8,100 GitHub repos. Then this happened:

They filed a DMCA. GitHub nuked the entire network within hours. Developers got notices for forks of Anthropic's OWN public repo - one guy's fork had zero leaked code.

Boris Cherny, head of Claude Code, had to go on X personally: "This was not intentional. Should be better now."

Meanwhile Sigrid Jin - who used 25 billion Claude Code tokens last year - woke up at 4AM and rewrote the entire thing in Python before sunrise. DMCA can't touch a clean-room rewrite.

It hit 50K stars in 2 hours. Fastest repo in GitHub history.

Today claw-code officially launched as an independent project with a formal press release. And the Rust port merged today - what started as a panic rewrite now ships release 0.1.0.

140K stars. 102K forks. More than Anthropic's own repo.

512,000 lines are in the wild forever.

What started as Anthropic's biggest embarrassment just became their most dangerous competitor. You cannot make this up.


Peak-hour limits are tighter and 1M-context sessions got bigger; that's most of what you're feeling. We fixed a few bugs along the way, but none of them were overcharging you. We also rolled out efficiency fixes and added in-product popups to help you avoid large prompt cache misses.

Once again it is time to pull out the sign:

Digging into reports, most of the fastest burn came down to a few token-heavy patterns. Some tips:
• Sonnet 4.6 is the better default on Pro. Opus burns roughly twice as fast. Switch at session start.
• Lower the effort level or turn off extended thinking when you don't need deep reasoning. Switch at session start.
• Start fresh instead of resuming large sessions that have been idle ~1h.
• Cap your context window; long sessions cost more: CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000 (sketch below)
We're rolling out more efficiency improvements, so make sure you're on the latest version. If a small session is still eating a huge chunk of your limit in a way that seems unreasonable, run /feedback and we'll investigate.
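For anyone unsure where that last variable goes: a minimal sketch, assuming CLAUDE_CODE_AUTO_COMPACT_WINDOW is read from the shell environment when Claude Code starts (the exact launch command may differ in your setup):

  # Cap the auto-compact window at 200K tokens for this shell session,
  # so long sessions compact earlier instead of dragging a huge context.
  export CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000
  claude   # launch Claude Code with the cap in effect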

"Using coding agents well is taking every inch of my 25 years of experience as a software engineer."

Simon Willison (@simonw) is one of the most prolific independent software engineers and most trusted voices on how AI is changing the craft of building software. He co-created Django, coined the term "prompt injection," and popularized the terms "agentic engineering" and "AI slop."

In our in-depth conversation, we discuss:
🔸 Why November 2025 was an inflection point
🔸 The "dark factory" pattern
🔸 Why mid-career engineers (not juniors) are the most at risk right now
🔸 Three agentic engineering patterns he uses daily: red/green TDD, thin templates, hoarding
🔸 Why he writes 95% of his code from his phone while walking the dog
🔸 Why he thinks we're headed for an AI Challenger disaster
🔸 How a pelican riding a bicycle became the unofficial benchmark for AI model quality

Listen now 👇
youtu.be/wc8FBhQtdsA




I hear you. I have all of the same things going on: max plans, OpenRouter usage, local infra. I understand the qualitative difference in the models. But I still stand behind my point.

This isn't the first time Anthropic has pulled this. They've silently served quantized models at peak periods without disclosing it, and been caught red-handed. OpenAI had a similar quality degradation that I know you remember, and it never got a proper root cause. It's a gamble to put all the eggs in their basket:

1/ if you trust their top-line pricing as a proxy for capacity/cost, you are hoping they achieve 2 OOMs of perf improvements and reflect it in the rate limits on these plans.

2/ you are hoping this won't be a never-ending cycle of chasing the qualitative benefits of the latest model, which won't yet have those perf benefits/liberal rate limits.

As an observer, one thing I'd like to point out: I watch all of the interesting work you are doing and the products you are releasing to improve agentic harnesses and orchestration. You are clearly doing a great service, and you offer a lot of it freely to others. But the one area I don't see you pointing your token generation at is building the tooling that will lessen your dependency on these models. A bit of your focus could probably go there, to hedge against your future assumptions and needs.







