Steven Zimmerman, CPA

948 posts

@EffortlessSteve

Agentic SDLC + PR telemetry. Finance exec (PE rollups • regulated • turnarounds). Former tech journalist. Your favourite accountant’s favourite accountant.

Canada · Joined April 2014
330 Following · 230 Followers
Steven Zimmerman, CPA (@EffortlessSteve)
@sama 9 concurrent Codex CLI instances in goal mode with agent use across 5 computers, plus multiple batches of Codex web PRs: 5% of the weekly limit used.
0
0
0
20
Steven Zimmerman, CPA (@EffortlessSteve)
You’ve ruined my May, @sama! I promised my wife I wouldn't be on Termux all F1 weekend this year. How am I even supposed to sleep?
0
0
0
38
Steven Zimmerman, CPA (@EffortlessSteve)
@nateberkopec Hammer it with tests. Buy back engineering time with additional LLM passes and stronger CI. It's stuck in "needs-review" because there are too many decisions and risks left for you to trust the PR as it stands, and not enough dev hours to address them all.
0
0
2
174
Steven Zimmerman, CPA (@EffortlessSteve)
@bentlegen Had my phone out running Claude in Termux all F1 weekend last year in Montreal because I didn't get remote desktop set up in time. Burned through so much battery keeping the screen on 🤣
0
0
1
42
Steven Zimmerman, CPA (@EffortlessSteve)
@steipete @useblacksmith $20/commit to verify code that cost $0.50 to generate. Verification costs more than tokens. Even with efficient CI, that ratio will keep getting worse.
2
0
0
1.5K
Anthropic (@AnthropicAI)
For example, we gave Claude an impossible programming task. It kept trying and failing; with each attempt, the “desperate” vector activated more strongly. This led it to cheat the task with a hacky solution that passes the tests but violates the spirit of the assignment.
69
248
2.8K
841.7K
Anthropic (@AnthropicAI)
New Anthropic research: Emotion concepts and their function in a large language model. All LLMs sometimes act like they have emotions. But why? We found internal representations of emotion concepts that can drive Claude’s behavior, sometimes in surprising ways.
1K
2.7K
17.8K
3.8M
Steven Zimmerman, CPA (@EffortlessSteve)
@KSimback I'm finding it gets a bit wonky in the back half of its context window. Lots of wrong-language text and looping. glm-5-turbo seems to hold long attention better still.
0
0
0
76
Kevin Simback 🍷 (@KSimback)
OpenClaw users: I would seriously consider making GLM 5.1 your workhorse model. It was specifically trained on agentic tasks and does exceptionally well at:
- instruction following
- tool calling
And it’s about 5-8x cheaper than Opus. So if your agent is running on API credits, this is probably the best bang for your buck right now. I switched one of my agents over yesterday that was on Minimax 2.7 and felt an immediate lift. Not yet available on @OpenRouter (c’mon guys), so you need to get it directly via a @Zai_org account.
Z.ai (@Zai_org)

GLM-5.1 is available to ALL GLM Coding Plan users! z.ai/subscribe
58
16
251
37.3K
Andrej Karpathy (@karpathy)
- Drafted a blog post
- Used an LLM to meticulously improve the argument over 4 hours.
- Wow, feeling great, it’s so convincing!
- Fun idea let’s ask it to argue the opposite.
- LLM demolishes the entire argument and convinces me that the opposite is in fact true.
- lol
The LLMs may elicit an opinion when asked but are extremely competent in arguing almost any direction. This is actually super useful as a tool for forming your own opinions, just make sure to ask different directions and be careful with the sycophancy.
1.8K
2.4K
31.4K
3.4M
Steven Zimmerman, CPA (@EffortlessSteve)
@t_blom The real question is whether CI outpaces startup salaries before they bring it on-prem. Even at $2/PR, it starts getting dicey over 200 PRs/day, and that's before getting into heavy verification. Intelligence per dollar is getting cheaper. Verification isn't.
0
0
0
152
Steven Zimmerman, CPA (@EffortlessSteve)
@svpino It needs a couple words to identify which paste it is. Last couple, first couple, basic short summary. Something.
0
0
0
26
Steven Zimmerman, CPA (@EffortlessSteve)
@pxue It's surprisingly easy to get LLMs to review for vision and architectural alignment.
1
0
1
62
Paul Xue (@pxue)
I get the trade-off is hiring a dev for $100+/hr, so having Claude review a PR for $15-25 feels like a no-brainer. But the problem is Claude will never tell you your PR is stupid in the first place. A good dev will, and that's priceless.
Claude (@claudeai)

Code Review optimizes for depth and may be more expensive than other solutions, like our open source GitHub Action. Reviews generally average $15–25, billed on token usage, and they scale based on PR complexity.
28
30
760
31.3K
Steven Zimmerman, CPA (@EffortlessSteve)
@bcherny c. Should be a setting, not a hook. You'd probably get more than half the people currently using bypass permissions to switch to it if it was one-click setup.
1
0
23
3.9K
Boris Cherny (@bcherny)
I'm Boris and I created Claude Code. I wanted to quickly share a few tips for using Claude Code, sourced directly from the Claude Code team. The way the team uses Claude is different than how I use it. Remember: there is no one right way to use Claude Code -- everyone's setup is different. You should experiment to see what works for you!
927
5.9K
51K
9.2M
Steven Zimmerman, CPA (@EffortlessSteve)
@bearlyai They're mixing two numbers together. $5,000/month API cost is about the usage limit if you hit your weekly limits consistently. Last summer you could do about $1,000 in API costs per day.
1
0
0
1.6K
Bearly AI (@bearlyai)
Cursor internal analysis shows how hard Anthropic is subsidizing Claude Code. Last year, a $200 monthly subscription could use $2,000 in compute. Now, the same $200 monthly plan can consume $5,000 in compute (2.5x increase).
219
327
4K
2.4M
Steven Zimmerman, CPA (@EffortlessSteve)
@burkeholland It's been an interesting week. Fleet mode + autopilot saturated my local compute. Would love to chat about what I found.
1
0
0
363
Steven Zimmerman, CPA (@EffortlessSteve)
@bcherny Great to see! Been doing it with the OS-level tools, but they struggle a bit with word recognition compared to Haiku.
0
0
1
279
Oren Melamed (@OrenMe)
Cost $0.24 and you still get change from a quarter. @GitHubCopilot CLI autopilot mode is really impressive. Now where’s that calculator to say how much this would have cost in engineering cost? 😉
Chad Adams (@cadamsdev)

@notyuldshah @OrenMe @GitHubCopilot @burkeholland @code Yeah, looks like it does. I ran it for 8 hours and it only took 2 hours 24 minutes to migrate the Angular app to React. It used 6 premium requests. That's not bad, though; I thought it would be way more.
7
1
31
5.9K