Tom Arnold

56 posts

@FoundryLabsTom

AI business systems research, implementation and consulting. Former network support engineer and technical SASE security product specialist at Cisco Meraki.

Indianapolis, IN · Joined December 2025
136 Following · 13 Followers
Tom Arnold @FoundryLabsTom
@plainionist Nope - I don’t miss sifting through Stack Overflow at all lol
0
0
1
25
Seb @plainionist
Could you imagine going back to coding without AI? 🤔
8
1
3
493
Tom Arnold @FoundryLabsTom
@iatnon @BillyBowss @ClaudeDevs Unfortunately it's the correct answer though. I lasted about 2 days on the 5x plan last summer before needing to bump up to 20x.
1
0
0
107
Antoni @iatnon
I'm actually devastated... What am I supposed to do the rest of the week?!?! Only halfway through this week and ran out of weekly limits. CLAUDE WHAT HAVE YOU DONE!?! Never even came close to weekly limits before @ClaudeDevs please fix this!
Antoni tweet media
66
4
156
41.8K
Tom Arnold @FoundryLabsTom
@Bhavani_00007 The Claude vs. Codex mindset is wrong - you should be using both to their strengths.
0
0
2
511
Bhavy☄️ @Bhavani_00007
I'm a Claude user. Give me one reason to switch to Codex
Bhavy☄️ tweet media
364
10
580
161.5K
Dave W Plummer @davepl1968
If lines of code were a meaningful measure of software quality, then the ideal program would be an infinitely long, labyrinthine monument to redundancy in which every variable enjoyed its own paragraph, every conditional branch was lovingly expanded into a dissertation, every reusable function was duplicated dozens of times for the sake of numerical abundance, and every simple idea was buried beneath geological strata of boilerplate, thereby rewarding verbosity over clarity, complexity over elegance, maintenance burden over maintainability, bug surface area over correctness, and programmer ego over engineering discipline, while simultaneously ignoring the inconvenient reality that the history of computing is littered with brilliant achievements—from hand-crafted assembly routines to elegant algorithms and compact operating system components—that derived much of their value precisely from accomplishing more with less, making lines of code roughly as sensible a productivity metric as measuring the quality of a novel by its weight, the efficiency of an engine by the amount of metal in the block, or the intelligence of an argument by the number of words it takes to arrive at the point. Brevity is wit.
roon @tszzl

lines of code is a better metric than people think it is. token use is a better metric than people think it is
38
18
250
16.1K
Tom Arnold @FoundryLabsTom
@AnthropicAI Would be interested to hear the details on what it took for the quality of code produced to be on par with human code. Like, what is the baseline labor-hour input for AI-produced code to match good human-produced code for the same objective?
0
0
1
928
Anthropic @AnthropicAI
The speedup isn’t just in volume. On open-ended coding problems where answers are unclear, Claude’s success rate is now 76%—a 50 point jump in just 6 months. Many engineers also say Claude’s code quality is now on par with human code; we expect it to be better within the year.
Anthropic tweet media
43
114
2.2K
487.1K
Anthropic @AnthropicAI
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor. It’s happening faster than we thought, and the implications deserve greater attention. anthropic.com/institute/recu…
1.7K
4.6K
28.1K
17.8M
Tom Arnold @FoundryLabsTom
One of the more annoying issues I surfaced earlier this week in some of my long-running Claude Code workflows is validator/verification agents focusing almost entirely on issues present within specs/code/reports etc. and overlooking what's not present, like the issues you'd expect a gap analysis to uncover.

So Monday and Tuesday of this week I was running into the same types of issues you've outlined: Architect or another agent finds bug or issue > creates work order > work order processed by validation agent > work order updated by validator > implementation agent implements > something overlooked and new bug implemented > cycle repeats until complete. The goal of implementing the fix is ultimately achieved but required additional loops = opportunity for efficiency improvements.

The fixes have been simple enough for Claude in this case - I needed to improve my prompting/instructions to consider not just the code or artifact being validated, but to also think about what is missing, or outside of the immediate blast radius. That might sound obvious but things were working well previously and I liked the way my workflows were running for several weeks, with minor adjustments here and there.

Not sure if this is more a quirk specific to 4.8, but I can see this got me multiple times in a couple of days for my Claude Code workflows after the release of 4.8. Of course, any new model release requires some adjustments to skills built on older releases. I just found your post interesting with what I've observed recently, as well. 🙂
0
0
1
88
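The gap-analysis idea in the post above (checking for what is missing, not only for problems in what is present) can be sketched in a few lines of Python. This is a minimal illustration, not the author's actual validator: the function name `validate_work_order`, the `required_sections` checklist, and the `TODO` marker standing in for "an issue found in present content" are all assumptions made for the example.

```python
def validate_work_order(
    sections: dict[str, str], required_sections: list[str]
) -> dict[str, list[str]]:
    """Report issues in present sections AND required sections that are absent.

    Hypothetical sketch: a real validator agent would reason over specs, code,
    or reports. Here a 'TODO' marker stands in for any issue in present content.
    """
    # Pass 1: issues inside what is present (where the validators were focusing).
    present_issues = [name for name, body in sections.items() if "TODO" in body]
    # Pass 2: gap analysis, i.e. expected items that never appear at all.
    missing = [name for name in required_sections if name not in sections]
    return {"present_issues": present_issues, "missing": missing}


# Hypothetical work order with one flawed section and one absent section.
work_order = {
    "bug_description": "Login fails on expired token.",
    "proposed_fix": "TODO: decide between refresh and re-auth.",
}
report = validate_work_order(
    work_order, ["bug_description", "proposed_fix", "regression_tests"]
)
print(report)
# {'present_issues': ['proposed_fix'], 'missing': ['regression_tests']}
```

Keeping the two passes separate makes the "what is not here" check explicit, so it cannot be skipped silently when the present content looks clean.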
Seb @plainionist
My AI agents are starting to feel like a real team. Architect identified a potential bug and asked Verifier to confirm. Verifier confirmed. Implementer fixed it. But the fix broke another scenario. Verifier escalated back to Architect. Architect reviewed and approved the next step. Eventually everything passed. AI teamwork 😉
20
0
24
1.5K
Tom Arnold @FoundryLabsTom
@plainionist I love hearing interesting ways ppl implement inter-agent collaboration. What is your framework for this team?
1
0
1
38
ClaudeDevs @ClaudeDevs
We've changed the trigger word from "workflow" to "ultracode". You can still say "use a workflow for this", but when you're clearly referring to something else, Claude won't kick off a dynamic workflow. For an explicit trigger, use "ultracode". We appreciate the feedback!
ClaudeDevs @ClaudeDevs

New in Claude Code (research preview): dynamic workflows. Claude writes an orchestration script on the fly, then spins up a large fleet of coordinated subagents in parallel to take on your most complex tasks. Use the word "workflow" in a prompt to get started.
273
305
5.1K
569.8K
ThePrimeagen @ThePrimeagen
Looking for new fantasy book:
* I loved Wheel of Time
* Liked Mistborn
* Loved Stormlight Archive book 1/2, liked 3/4, meh on 5. (Just turned Kaladin into a soyboy)
* Another Kingdom enjoyed
What else should I read?
810
11
762
137.4K
Dan B @BachelderDan
@ThePrimeagen Dungeon Crawler Carl. The best audiobook series I've ever listened to.
8
1
51
2.1K
Tom Arnold @FoundryLabsTom
@wbuxtonofficial He still has time! Who knows, maybe his Indy 500 experience this year will inspire him.
0
0
3
938
Tom Arnold @FoundryLabsTom
I've been enjoying Opus 4.8 more than 4.7, but there are still some annoying issues and performance quirks, partly related to the Claude Code harness. But overall, it feels better... Still hesitant to use the API in my own harness due to cost though.
0
0
1
50
Tom Arnold @FoundryLabsTom
@theo There's no incentive for them to improve efficiency while enterprise corps are still willing to torch absurd amounts of money on tokens... For the same results they could achieve with their own harnesses and decent model routing, at a fraction of the cost.
0
0
10
3.5K
Tom Arnold @FoundryLabsTom
@theo What kills me is how they've rebranded Skills (put your instructions in a directory) as "Workflows!" My new favorite command stub: /Skills-on-LSD > workflows
0
0
1
2.5K
Tom Arnold @FoundryLabsTom
Most workflows (skills) I build are essentially a large task list performed by multiple different models. When in CC it's Sonnet/Opus. Each task in the flow requires the agents to output their results in its own file in a directory. This happens for every task. Every workflow run can produce between 6-40 artifacts that can be analyzed later by another workflow. I execute anywhere from 20-100 workflows a day. This produces a large dataset I can audit and analyze. I built a hallucination check skill called `/badtrip`. It runs Sonnet/Opus against each other to catch hallucinations. Not only do I catch hallucinations, but I produce scorecards on model performance when validators aggregate results. I asked CC to visualize this skill and got this mediocre wall of markdown I can't even share in one screenshot.
Tom Arnold tweet media
0
0
0
22
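The pattern described in the post above (one artifact file per task in a run directory, then two models checked against each other) can be sketched roughly as follows. This is not the author's `/badtrip` skill, whose internals the post does not show. The file naming scheme, the JSON layout, the `write_artifact` and `cross_check` helpers, and the use of plain disagreement as a stand-in for a hallucination signal are all assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_artifact(run_dir: Path, task: str, model: str, claims: dict[str, bool]) -> Path:
    """Write one task's output for one model to its own JSON file (hypothetical layout)."""
    path = run_dir / f"{task}.{model}.json"
    path.write_text(json.dumps({"task": task, "model": model, "claims": claims}))
    return path


def cross_check(run_dir: Path, model_a: str, model_b: str) -> dict:
    """Compare two models' verdicts per task and flag the claims they disagree on.

    Disagreement is only a proxy here: it marks claims worth auditing,
    it does not prove that either model hallucinated.
    """
    flagged, agree, disagree = [], 0, 0
    for file_a in sorted(run_dir.glob(f"*.{model_a}.json")):
        data_a = json.loads(file_a.read_text())
        file_b = run_dir / f"{data_a['task']}.{model_b}.json"
        if not file_b.exists():
            continue  # no counterpart artifact to compare against
        claims_b = json.loads(file_b.read_text())["claims"]
        # Only claims that both models judged can be compared.
        for claim in sorted(set(data_a["claims"]) & set(claims_b)):
            if data_a["claims"][claim] == claims_b[claim]:
                agree += 1
            else:
                disagree += 1
                flagged.append((data_a["task"], claim))
    return {"flagged": flagged, "agree": agree, "disagree": disagree}


def demo() -> dict:
    """Run the sketch on two made-up artifacts in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp)
        write_artifact(run_dir, "task1", "sonnet", {"file_exists": True, "api_exists": True})
        write_artifact(run_dir, "task1", "opus", {"file_exists": True, "api_exists": False})
        return cross_check(run_dir, "sonnet", "opus")


print(demo())
# {'flagged': [('task1', 'api_exists')], 'agree': 1, 'disagree': 1}
```

Because every task writes its own file, the tally can be recomputed later over any set of run directories, which is what makes the artifacts auditable after the fact.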
Haider. @haider1
wait, is it just me, or is opus 4.8 getting dumber?
160
20
781
133.4K
Dave W Plummer @davepl1968
Dude, I retired like ten years ago. Since then, I've written and published two books, a metric crapton of software, learned 3D graphics and embedded systems, built a YouTube channel with a million subscribers, restored two cars, sent three kids off to college, learned to drive a race car, and now I'm a public speaker for fun. I don't have a garden. No chill needed.
HollyCabot @HollyCabot

New rant: I honestly don't get anyone wanting to retire in their 40s and 50s. Like WTF.. how much time can one spend in a garden or whatever the hell people do all day. My gosh, take a fucking vacation and then get back to life. We weren't created to "chill"

100
49
1.7K
95.7K
Tom Arnold @FoundryLabsTom
@theo So far, from what I've experienced, the only thing you're missing is a buggy CC experience. Half of my long-running workflows had API errors over the last hour, and it still outputs walls of nonsense text like it just had its first toke and can't shut up.
1
0
3
1.8K
Tom Arnold @FoundryLabsTom
@claudeai Cool, now if you could fix this bug I've been running into for the past 2 hours that would be great.
Tom Arnold tweet media
0
0
0
914
Claude @claudeai
Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors. Available today at the same price.
Claude tweet media
3.7K
8.7K
67.5K
15.1M