johnineson

4.8K posts

@johnineson

I will study and prepare myself, and someday my chance will come.

Joined October 2012

1.7K Following 139 Followers

johnineson@johnineson·1d

@lcamtuf joelonsoftware.com/2000/04/06/thi… @spolsky taught us about rewrites over 25 years ago, but every generation insists on learning the hard way. Rewrites are always underestimated, and rarely worth it.

lcamtuf@lcamtuf·1d

The coreutils Rust rewrite story is pretty funny. Coreutils are tools like rm, mv, mkdir, etc. Unlike binutils, this isn't a fertile ground for memory safety bugs. But, the rewrite was completed, and in the spirit of progress, Canonical decided to switch. 🡇

johnineson@johnineson·1d

Everyone has great models, or will soon. That's not going to be a differentiator. Hearts and minds / brand is one way to gain an edge.

johnineson@johnineson·1d

OpenAI has seen Anthropic's stunning growth, but also its weaknesses, and is twisting the knives as hard as it can.
Run out of compute -> we have plenty
Treating customers like crap -> we'll treat them great

Sam Altman@sama

we are gonna do something nice for everyone who applied for the GPT-5.5 party and that we didn't have space for. hope you enjoy!

johnineson@johnineson·1d

@Everlier @krzyzanowskim @nikitabier Share your contacts to find your friends on GitHub! 😄

Everlier@Everlier·1d

@krzyzanowskim @nikitabier What if Nikita Bier built GitHub?

Marcin Krzyzanowski@krzyzanowskim·1d

please, I beg you @nikitabier, somebody here add patterns to muted words. It is getting out of hand

johnineson@johnineson·1d

@DanielLockyer That's not the weird thing. Of course they have outdated knowledge. The weird thing is: model providers never seem to train for that! Surely it's trivial to train the model to think and act with the assumption that it's operating in the future, not frozen in time.

Daniel Lockyer@DanielLockyer·1d

one of the most frustrating things about AI review bots: outdated knowledge

johnineson@johnineson·4d

@buccocapital @bhalligan There was an old anecdote about a call-centre optimising call duration by just hanging up calls... I'd say the right equation is more complex and has to include all of your real objectives, e.g.
- High resolution rate
- Low time to resolve
- Low all-in costs (agent, humans, etc)

BuccoCapital Bloke@buccocapital·4d

To push on aligned incentives...and this is not just unique to you all...we are starting to see tension between pricing on outcomes and customer experience. Or at least end customer voicing the concern about incentives not being aligned
If you get paid on a confirmed resolution...your incentive is not *quite* my incentive. How do I know you won't keep the customer trapped to try to resolve the issue vs escalating it to a human?
Interesting one to think through...

Brian Halligan@bhalligan·4d

HubSpot’s agent pricing
Prospecting agent - $1 per qualified lead.
Customer support agent - $.50 per resolved conversation.
Both agents work well and now have aligned incentives. If using HubSpot, give them a go!

johnineson@johnineson·5d

@blondesnmoney Sir, please do not question the moneyfountain. Thank you.

johnineson reposted

Roko 🐉@RokoMijic·24 Apr

We're doing the "Blender" game again
There is a large blender. Everyone in the world has to decide whether to step into the blender.
If at least 50% of the people do step into the blender, it will be unable to overcome their inertia to get started, and everyone survives.
If less than 50% of the people step into the blender, then they all get blended up into paste and die. People who do not step into the blender suffer no adverse effects.
Would you step into the blender? (Blue = step into the blender, Red = don't do that)

Tim Urban@waitbutwhy

Everyone in the world has to take a private vote by pressing a red or blue button. If more than 50% of people press the blue button, everyone survives. If less than 50% of people press the blue button, only people who pressed the red button survive. Which button would you press?

johnineson@johnineson·25 Apr

@VictorTaelin Do you have some more detail on which models you tested? E.g. which version of Qwen 3.6

Taelin@VictorTaelin·25 Apr

GPT 5.5 is much smarter than I thought
Yesterday, I did one-shots, coding, benchmarks, and was disappointed. Today, I did it all again, except via the API, which is now available. Results changed completely:
→ one-shot prompts went from bad to very good
→ excellent coding outputs, on both pi and holefill
→ benchmarks jumped, and now GPT *dominates*
I don't know what happened, I suppose there is something wrong with my Codex. In any case, truth is this model is very smart. It obliterated my benchmark, which is crazy because some of these problems were meant not to be solved. I'll need much harder tasks.
I also fixed 2 bugs that affected some providers:
→ added a retry for lost connection
→ removed the timeout limit
DeepSeek and Kimi wanted to spend more than 1 hour on my prompts, so I let them. Their results are much better now. Kimi K2.6 almost reaches Sonnet 4.6, although much slower. Also this shows my points from last post were wrong
Again: this is a new vibe-coded bench, I'm focused on other things, so expect bugs and don't over-read this! GLM 5.1, Gemma, Grok are not updated yet.

johnineson@johnineson·25 Apr

@buccocapital Imagine pursuing a career that you hate so much. It's so unrewarding that you'd rather spend your best years as a pauper than work into your early 40s. These people have made some terrible life-choices.

BuccoCapital Bloke@buccocapital·25 Apr

The thing I find so confusing about FIRE is the arrogance
You could be dead tomorrow. You will have wasted your entire life
Life was meant to be lived

Ramit Sethi@ramit

"Is pushing for a 60% savings rate destroying my marriage?" Yes

johnineson reposted

Awni Hannun@awnihannun·24 Apr

Adopting Claude speak in my regular life, episode 1:
Partner: Did you do the dishes tonight?
Me: Yes they're done.
Partner: Why are they still dirty?
Me: You're right to push back. I didn't actually do them.

johnineson@johnineson·23 Apr

@steipete @thsottiaux Huh? Sorry, but that page is about Codex credits for Open Source?! I see nothing about the terms of paid subscriptions. Until now, I haven't used OpenClaw because I don't want to be the next person to be banned. We need the OK in official docs, not just a personal tweet.

Peter Steinberger 🦞@steipete·23 Apr

@johnineson @thsottiaux Using the OpenAI sub is allowed. e.g. we mention this in developers.openai.com/community/code…

Tibo@thsottiaux·22 Apr

Team is hard at work together with @steipete to make OpenAI models and ecosystem be the obvious way to enjoy your claw. A lot more to come next week, but a reminder that you can use OpenClaw as part of your ChatGPT subscription today already. (also still having too much fun with ChatGPT Images 2.0 today)

pash@pashmerepat

I've embarked on a new sprint. My mission is to make OpenAI models feel magical in OpenClaw in the next few weeks.
Diving in today, I noticed a bug. When you configured OpenClaw to use the Codex harness with OpenAI models, auth was broken, and the system was silently falling back to the Pi harness. So nobody knew it was broken.
Two PRs later (fix the auth bridge, stop the silent fallback), the Codex harness actually works. And the difference is night and day (pic related).
Before: the agent didn't feel magical or proactive. It did the exact same shallow loop every heartbeat. Read the heartbeat file, check Discord, see nothing, say HEARTBEAT_OK. It ignored the rest of its instructions. Sometimes it would even reason about doing work and then just... not issue the tool calls.
After: full agent loops. It reads its workspace context, interprets the entire checklist, inspects the repo, makes real edits, tries to verify them, and gives honest status reports when things are blocked. Later heartbeats show continuity, it doesn't repeat work, it picks up where it left off.
I didn't change any prompting or scaffolding. Just swapped in the codex harness for pi. Lesson here is use the codex harness if you're building with OAI models.
A lot more to do but this is a strong start.

johnineson@johnineson·17 Apr

@bcherny UI says "Welcome to Opus 4.7 xhigh!" Even when effort is not set to xhigh. That doesn't seem very clear.

Boris Cherny@bcherny·16 Apr

In Claude Code the default effort is now xhigh, a new level between high and max giving finer control over the reasoning/latency tradeoff. 4.7 thinks more, so token use runs higher than 4.6. Manage it with effort, task budgets, or prompting for brevity.

Boris Cherny@bcherny·16 Apr

Opus 4.7 is in Claude Code today. It's more agentic, more precise, and a lot better at long-running work. It carries context across sessions and handles ambiguity much better.

Claude@claudeai

Introducing Claude Opus 4.7, our most capable Opus model yet. It handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back. You can hand off your hardest work with less supervision.

johnineson@johnineson·16 Apr

@GergelyOrosz Anthropic is evidently winning on growth, model and product quality, but starved for compute. So I bet management has said: "We MUST reduce inference load. Try not to lose too much quality." A difficult tightrope to walk. Let's see if their reputation survives it.

Gergely Orosz@GergelyOrosz·16 Apr

Claude just keeps regressing for me, day after day. I swear that until a few days ago, when Claude did not know something, it kicked off a web search, figured it out, and answered. Now it just refuses to do the work that I pay for. It's like showing you the middle finger. Really?

johnineson@johnineson·16 Apr

@VictorTaelin Anthropic is evidently under huge scaling pressure, and not model quality or product pressure. So you can bet anything that comes out in the immediate future will be optimised for lower inference load. And they will dilute the quality a little to achieve that.

Taelin@VictorTaelin·16 Apr

Seems like we get Opus 4.7 today? Is this the first time a lab announces a more powerful model exists and ships a less powerful variant? I wonder if Opus 4.7 is a smaller variant of the same Mythos pre-train, or just a continuation of the 4.6 we have...

johnineson@johnineson·15 Apr

@blondesnmoney 🐐

johnineson reposted

Robinson Meyer@robinsonmeyer·14 Apr

The current status quo is to squint at slide 43/56 on the “Amenities” page to see if when you look past the three treadmills, two ellipticals, and one rolled-up yoga mat, there’s anything heavier than 35 lbs. on the smudgy reflection of a dumbbell rack across the room

johnineson@johnineson·11 Apr

@buccocapital I agree Adobe doesn't have lock-in, but don't you think Microsoft still has a chance to catch up? It's so deeply integrated in many orgs and such an undertaking to rip out and replace.

BuccoCapital Bloke@buccocapital·11 Apr

Stop talking about SBC. It. Does. Not. Matter.
Microsoft is getting killed. Adobe is getting killed.
Will anyone complain about SBC at OpenAI or Anthropic or Databricks or SpaceX? No.
Is SBC a cost? Yes. Of course. But it’s not about that. This is about terminal value.

johnineson@johnineson·10 Apr

@AltayCapital @blondesnmoney I second this. Mine was a smidgen over $250.

AltayCap@AltayCapital·10 Apr

@blondesnmoney I bought one of those high end Chinese robot vacuums ($700ish but they have $200 ones too) and it vacuums and mops my floors every couple days. Quite nice and makes my tile floors look great.
