Benjamin Stein

7.3K posts

@benstein

I love birds, bridge, bots, beer, bears, beets, Battlestar Galactica. Electrify everything. Helping parents drowning in family logistics: https://t.co/CCGVWB9IkS

Oakland · Joined February 2007

880 Following · 3.4K Followers

Benjamin Stein@benstein·5h

Here's an actual snippet of a blog post that 4.8 generated for me. On the plus side, I've gone back to writing myself :-) "Every shipped algorithm was read by three more agents working different angles. An auditor checked that the code does what its description claims. A second hunted for the algorithm's weakest assumption. A third wrote the one-liner you have been reading throughout this post. The auditors flagged nine algorithms for honesty. I read all nine. Every one was a reviewer holding an "extract" algorithm, which is permitted to read the date with a getter, to the stricter standard meant for the from-scratch cohort. Nothing was actually misrepresenting itself, which is the result I wanted and did not assume. The weakest-assumption reviewers were more sobering." Word salad at best!

Benjamin Stein@benstein·17h

@davidad This is so on point. 4.8 is SO BAD at writing. Not in the emdash kinda way, but a legit "I'm sorry, I have no idea what you're trying to say right now" kinda way

819

davidad 🎇@davidad·1d

No one:
Claude Opus 4.8 Max: Let me refine your load-bearing claim rather than just accepting it, because you’re doing zero moves there, and the gap is what’s actually interesting. The one place I’d still push, because I think it matters: your message is wearing content-clothes, but the content isn’t actually *there*. The tell: it’s just an empty string. But the emptiness of the string IS its lack of content. Pull one, and the other goes inert. That’s the structural spine.

191

282

4.4K

507.3K

Benjamin Stein@benstein·7h

@hthieblot THE REALLY BIG BUTTON THAT DOESN'T DO ANYTHING. Technically it was 1994 tho stefangagne.com/spatulacity/bu…

152

Hubert Thieblot@hthieblot·18h

Anyone who surfed the early web between 1995-2010. What’s the one website/app you still think about?

16.7K

486

10.4K

3.5M

Benjamin Stein@benstein·18h

@trq212 @sidbid I was struggling to internalize dynamic workflows and what "write its own harness on the fly" actually meant. So I used it to fan out 484 agents to rebuild isitchristmas.com (turns out no, not today). Wrote up my learnings here: benjaminste.in/blog/2026/05/2…

175

Thariq@trq212·1d

Workflows are the biggest upgrade to Claude Code’s capabilities since skills and subagents. I dove deep into it with @sidbid to figure out best practices, examples and more. I’m particularly excited about the non-technical tasks it enables for Claude Code.

Thariq@trq212

x.com/i/article/2061…

154

305

4.2K

859.4K

Benjamin Stein reposted

Mayor Zohran Kwame Mamdani@NYCMayor·2d

Today, I signed an Executive Order temporarily repealing bedtimes in the City of New York so that kids of all ages can watch our team in the NBA Finals. As Mayor, you’re forced to make many difficult decisions. This was not one of them. Go Knicks.

2.9K

21.5K

333.8K

13.1M

Benjamin Stein@benstein·1d

@KibryHouse It really was.

KibryHouse🏳️‍⚧️@KibryHouse·2d

Playing Mario 64 when it first released must have been a life changing experience

530

124

2.8K

549K

Benjamin Stein@benstein·2d

I rebuilt "Is it Christmas" using 484 subagents and 16 million tokens to learn how Claude's dynamic workflows work. Spoiler Alert: today is not Christmas. benjaminste.in/isitchristmas/ h/t @konklone, as always

Benjamin Stein@benstein·2d

I'm a couple projects into Codex + 5.5. Early reactions:
* The coding model is very good. It YOLO sideloaded and debugged a new Android widget with no human intervention very very well.
* I hate using the mouse.
* Way too many back-and-forth questions. I kept finding Codex waiting for me. Like I say "LGTM let's build!" and come back 15 minutes later expecting a finished result but instead found a "Should I get started?"

Benjamin Stein@benstein·2d

And how would you trade off? If CC + Opus consistently generated better code output but you really dislike their interface (or vice versa), which would you use for day-to-day work?

Benjamin Stein@benstein·2d

When you say you prefer [ Claude Code | Codex ], are you optimizing for the user experience or the quality of the output?

Benjamin Stein@benstein·2d

@anothercohen A+ tweet

Alex Cohen@anothercohen·3d

You just won a 2-week, all-expenses-paid vacation. But there’s a catch: you have to stay within one region the whole time. What are you picking?

13K

Benjamin Stein@benstein·4d

@andrewneilson_ @dlwiest This is the actual answer. Ignore other replies. Enterprise is PAYG. The other (way too common) answer is people just use PAYG because they don't read/think e.g. paste your API key into Cursor and oops $3000

Andrew Neilson@andrewneilson_·4d

@dlwiest if you have more than 150 seats you have to convert to enterprise (where you pay api rates, possibly with a volume discount). Compliance-minded companies aren’t going to let you use personal plans, so the options are either team or enterprise.

116

cozybear@dlwiest·5d

Can anyone explain to me why companies don’t just give employees $100 / month Claude Code or Codex plans instead of paying per token? There has to be an explanation, because this keeps happening and doesn’t make sense otherwise

435

2.4K

612.5K

Benjamin Stein@benstein·5d

@moonpetal76 My parents liked Dustin Hoffman in The Graduate and my great-great-grandparents thought Eisenstein sounded too Jewish when they got off the boat.

moon ݁☾ּ ֶָ֢.🪷.@moonpetal76·5d

your username. explain. now.

3.2K

125

3.6K

1.2M

Benjamin Stein@benstein·5d

@denicmarko superduperlabs.com

Marko Denic@denicmarko·5d

What are you working on? Drop a screenshot or link.

369

198

31.3K

Benjamin Stein@benstein·5d

How can I get any work done with COYOTE PUPPIES ROMPING IN MY BACKYARD?!?!!!

Benjamin Stein@benstein·5d

Opus Ultracode should have been called "Hold my beer" 108/115 agents done · 41m 21s · ↓ 8.7m tokens

Benjamin Stein@benstein·5d

"comfortable" is not the word I'd use

Benjamin Stein@benstein·5d

@nateberkopec My system prompt includes: "Bluntly correct me when I'm wrong. I'd rather argue than have you cave, especially when I'm being an idiot." which results in a lot more "Let me push back..." which is often great.
