erik
601 posts

erik
@easel
Data Hacker. CTO and Cofounder at https://t.co/BiP4BcniIt. CTO at https://t.co/ncrsKFvzYP
38.895864,-77.042532 · Joined December 2006
970 Following · 294 Followers

100% this. Little chunks of reusable content you can use to steer your friendly clanker make a big difference.
Matt Pocock@mattpocockuk
Writing ADRs for agents has been such a good decision. Capturing all the non-obvious decisions in a codebase makes every agent in your stack that touches the stack smarter. It's the thinnest layer of docs that captures the stuff code can't.
erik reposted

@dusterbloom @malikwas1f @davideciffa Between capability nerfs and usage limits local agents can’t happen soon enough!

@malikwas1f @davideciffa @easel Mate, appreciate and actually been working on it but Opus is so nerfed it is unbelievable ... can't get it to finish HumanEval+ ... fricking incredible
English

Huge thanks to @dusterbloom and @easel for implementing prefix caching + cold-start tuning and the upstream CUDA VMM fix. Luce PFlash is now ~10× faster warm, ~2.5× faster cold (block sparse attention autotune). Live for Qwen3.6 27B! 🏎️
github.com/Luce-Org/luceb…

erik reposted

@pupposandro Got it running on my RTX 5090 Laptop under WSL. Awesome! Next step, getting it wired up to ddx agent.


Qwen3.6-27B at 35 tok/s on a GB10 DGX. Almost 3× faster than vLLM+DFlash, 9× vs vLLM bf16.
Luce DFlash is now available on Blackwell consumer GPUs. 5090 and GB10 owners, you've been asking.
OpenAI-compatible tool calling works out of the box, so it drops straight into OpenCode, Hermes, Cline, whatever you run.
Huge thanks to the incredible @superoo7 for shipping this to the community. Repo in the first comment.


@Steve_Yegge Teams limits are way below Max 20x and not pooled very well at all, from my anecdotal experience. Offhand, I'd guess that the $150 Teams seats might get a similar budget to the $100/mo Max plan.

This is my third $200/month Claude Pro Max plan. My first two have maxed out their weekly limits, and now I'm about to hit a session limit in my third one. And I'm trying to dial UP my usage, with Gas Town.
Has anyone done the math to figure out when it becomes more economically sensible to buy the Teams package and give yourself all five seats?


I've been using Gas Town. This may not be the final iteration, but it is the way.
Steve Yegge@Steve_Yegge
It's been 12 days since I dropped Gas Town. The response has been off the charts. I've been working hard to keep up. Thanks to all the early adopters. I wrote up this survival guide. steve-yegge.medium.com/gas-town-emerg…
erik reposted


On the ground at SFO. I’ll be here all week attending #inbound2025. If you want to talk email and AI, hit me up!
erik reposted

Had to stack up 8 Mac Minis to get it running.
~5 tok/sec for now.
First time running inference on 8 Mac Minis - performance can be improved a lot (theoretical limit is >10 tok/sec on this setup).

EXO Labs@exolabs
Running DeepSeek-V3 on M4 Mac Mini AI Cluster. 671B MoE model distributed across 8 M4 Pro 64GB Mac Minis. Apple Silicon with unified memory is a great fit for MoE.
erik reposted

When SaaS products grow and the architecture starts to break down, two types of devs emerge:
1. Those who believe the system is beyond repair and needs to be totally rebuilt
2. Those who put their heads down and fix it one piece at a time
I've only seen group #2 be successful.
erik reposted

People have too inflated a sense of what it means to "ask an AI" about something. The AIs are language models trained basically by imitation on data from human labelers. Instead of the mysticism of "asking an AI", think of it more as "asking the average data labeler" on the internet.
Few caveats apply because e.g. in many domains (e.g. code, math, creative writing) the companies hire skilled data labelers (so think of it as asking them instead), and this is not 100% true when reinforcement learning is involved, though I have an earlier rant on how RLHF is just barely RL, and "actual RL" is still too early and/or constrained to domains that offer easy reward functions (math etc.).
But roughly speaking (and today), you're not asking some magical AI. You're asking a human data labeler. Whose average essence was lossily distilled into statistical token tumblers that are LLMs. This can still be super useful of course. Post triggered by someone suggesting we ask an AI how to run the government etc. TLDR you're not asking an AI, you're asking some mashup spirit of its average data labeler.

@mitsuhiko I have a Trollberget in my office as a fallback for my standing desk. It's a good quality stool, durable and functional. If the look works for you, I'd go for it.