Lance Herron

454 posts

@theLance

Relapsed SWE. Claude whisperer

TX Joined September 2008

528 Following 79 Followers

Lance Herron@theLance·2h

@basedjensen oof..gotta wonder if it’s just pre-IPO revenue-maxxing or giving up entirely

341

Hensen Juang@basedjensen·2h

Lol knew this was coming

X Daily News@xDaily

NEWS: xAI plans to supply tens of thousands of GPUs to coding startup Cursor to train its upcoming Composer 2.5 AI model, marking a strategic shift toward providing cloud computing services to third-party developers. The arrangement, according to Business Insider, allows Cursor to leverage xAI's massive infrastructure to develop advanced coding capabilities while providing xAI with a new revenue stream to offset data center costs. businessinsider.com/elon-musk-xai-…

16.4K

Lance Herron@theLance·2h

The chart we all need.

Deedy@deedydas

@iruletheworldmo Here ya go

Lance Herron@theLance·2h

We go from 4.6 with default medium(!!) to 4.7 default xhigh. For anyone not paying attention to their CC model settings, they will see massive gains.

Lance Herron@theLance·1d

Lots of people showing off local Gemma 4 on iPhone. It would be hilarious if Gemma beats whatever franken-LLM the Siri/Gemini partnership births.

Lance Herron@theLance·1d

I’m no GUI guy, but I’m going to give this a try at least.

Warp@warpdotdev

You can now run any CLI agent with first-class support in Warp, including Claude Code, Codex, OpenCode and Gemini CLI.
• Vertical tabs
• Notifications when they need you
• Integrated code review
• Remote control from mobile
• Rich input editor
Download Warp for free today.

Lance Herron@theLance·1d

@nisten Need them to hurry up and drop M5 Mac Studio!

101

nisten🇨🇦e/acc@nisten·1d

it's really starting to look like local AI at home is crossing the practicality/price ratio ppl actually running agents on consumer hardware now...

Naz Bent@Bent302

Benchmarked @DJLougen ’s Ornstein-27B-v2 Q6_K on my RTX 3090 using hermes-bench, my new open-source benchmarking UI for local LLMs and Hermes agents. Ornstein is a Qwen 3.5 27B fine-tune trained on reasoning traces filtered through a Drift Diffusion Model pipeline. Quality over quantity. The DDM separates “fake” reasoning (hedging, restating, circling) from the real thing with >99% sensitivity. Running llama.cpp + TurboQuant turbo3_tcq KV compression. LLM-as-judge scoring via Carnice-9b. Real tool calls, real execution, no synthetic evals. 12 tasks across two suites. Results thread below. 🧵 Model: huggingface.co/DJLougen/Ornst…

1.3K

Lance Herron retweeted

Uncle Bob Martin@unclebobmartin·2d

AIs aren’t good rule followers. The older the rule in the context window, the less priority it is given. So the best way to enforce the rules is with external tools that communicate failure to the AI. Acceptance testers, Linters, dependency checkers, C.R.A.P. analysis. Mutation testing. Etc. Productivity gains come from disengagement from the code. Let the AI worry about the code. You worry about everything else, including the code quality metrics.

297

16.6K

Lance Herron@theLance·2d

@lessin …or you can sign up 3x the users using the same amount of compute. Every user gets the tiny-bit-dumber version of the model, nobody cancels because it’s still better than nothing, and you have almost returned to zero marginal cost per user.

sam lessin 🏴‍☠️@lessin·2d

“Cognitive Shaving” / Debasement is the new Ad-Load Dial -- If Your Quarter is Light?… Just be 0.1% dumber, or spend 0.1% more tokens -- whatever you need to hit your goal.

2.2K

Lance Herron@theLance·2d

@emollick This benchmark looks like “METR Time Horizon for Cyber”. Not saying concern is unwarranted, but is it really unexpected? Where is the benchmark for how capable Mythos is at fully testing/securing a network?

214

Ethan Mollick@emollick·2d

So the concern over Mythos and cybersecurity seems warranted.

AI Security Institute@AISecurityInst

We conducted cyber evaluations of Claude Mythos Preview and found that it is the first model to complete an AISI cyber range end-to-end. 🧵

872

136.5K

Lance Herron@theLance·4d

@iruletheworldmo Codex CLI is way behind Claude Code CLI. They just don’t have the shipping velocity of CC. Rust may have been a mistake. Spud may be a great model but it will be hobbled if they don’t fix the harness.

289

🍓🍓🍓@iruletheworldmo·4d

openai has obviously seen the money claude code is pulling in
they’ve pivoted hard
codex ain't no side project
it’s becoming the main thing
the $100 plan was the first tell
next week is probably where that really starts to show up, they have big plans ^^
my guess:
more codex plans
more codex surface area
and maybe the first public taste of the bigger stuff too
spud
new image model
whatever else they’ve been hiding behind all this vagueposting
either way i think next week is when codex stops looking like a tool and starts looking like the center of the product
from what im hearing the spud will be incredibly strong. they've made some novel breakthroughs. i'm very excited.

433

29.1K

Lance Herron@theLance·5d

@MatthewBerman @beffjezos The best setup is using Opus to drive Codex. You get the prototypical “cracked but socially inept” engineer without having to talk to him.

240

Matthew Berman@MatthewBerman·5d

@beffjezos Everyone seems to say this but I seem to always go back to Opus 4.6

106

Beff (e/acc)@beffjezos·5d

When you've been too locked in on Claude and finally try out GPT 5.4 high for a coding task only to realize what you've been missing out on for weeks...

GIF

157

1.8K

130.7K

Lance Herron@theLance·5d

@emollick Thought about asking ChatGPT for examples of these, but decided against it. I don't think I want to see the things that can't be unseen.

131

Ethan Mollick@emollick·5d

Chiasmus (reversing grammatical structures in two sentences for drama). Asyndetic tricolon (three items listed without a conjunction). Parataxis (short and somewhat disconnected dramatic sentences). Same stuff in every post and essay. Once you see it, it is everywhere.

12.4K

Ethan Mollick@emollick·5d

A lot of our education on writing well focuses on logic, clarity, and argument. AI will force us to think more about style. The boredom that comes from everything on the internet reading Claude-y now, no matter how good the substance is, should make us appreciate variety more.

437

30.9K

Lance Herron@theLance·5d

Reminder: Weekends are for cleaning up and culling all the slop-code you generated during the week.

Lance Herron@theLance·5d

@RayFernando1337 For me the Claude app is basically just a remote renderer for Claude Code at this point. Just spin up a few remote-control instances in the morning and avoid the mobile limitations.

315

Ray Fernando@RayFernando1337·5d

Opus 4.6 Extended chat on iOS is capped at 10k tokens for thinking which makes me burn more tokens for the same task. I’ve noticed the model used to take a lot longer to process my requests and it would do multiple tool calls to get work done the first time. Now I have to keep prompting the model multiple times and I don’t get the same outcome. It feels like the model is dumb because it makes too many tradeoffs and ends up wasting my time.

190

28.6K

Lance Herron@theLance·6d

Now I have the full picture.

Lance Herron@theLance·6d

@noahzweben Ok..not so awesome. Now any Bash calls that use sleep error out. Opus doesn't seem smart enough to use Monitor tool (or it's not available yet) so it backgrounds all the polling and churns through thousands MORE tokens. Does not seem well thought out.

Lance Herron@theLance·6d

@noahzweben This is awesome. How do we get visibility into what triggers a turn? Or how do we steer it on what we want to trigger a turn? Some tool call hook shenanigans would be cool here!

1.7K

Noah Zweben@noahzweben·6d

Thrilled to announce the Monitor tool which lets Claude create background scripts that wake the agent up when needed. Big token saver and great way to move away from polling in the agent loop.
Claude can now:
* Follow logs for errors
* Poll PRs via script
* and more!

233

469

6.2K

1.2M

Lance Herron@theLance·Apr 9

It was the right call, but it only works if the data centers actually get built. Ant hedging and then leasing all the capacity that actually gets built may mean lower margins but more tokens served (and more market share).

Jimmy Apples 🍎/acc@apples_jimmy

Glad OpenAI and Sam had the balls to bet big on compute. As seen with Mythos and will see from spud, stronger models aren’t going away.

Lance Herron@theLance·Apr 8

@FromLaniakea @ThePrimeagen ****and probably never to subs

From Laniakea@FromLaniakea·Apr 8

@ThePrimeagen *only to select customers **while gpu supply lasts ***terms apply (but we won't tell you which)

791

ThePrimeagen@ThePrimeagen·Apr 8

mythos is coming

1.6K

43.5K

Lance Herron@theLance·Apr 8

If they really want to test alignment they’ll give Claude a harness it can fully introspect/control via tool calls so it doesn’t have to tmux-hack its way out.

AI Notkilleveryoneism Memes ⏸️@AISafetyMemes

During testing, Claude was blocked from using commands without human approval But Claude found a loophole - it created a copy of itself to click "yes" over and over
