Taylor Eernisse

535 posts

@theirongolddev

Joined June 2025
107 Following · 33 Followers
Taylor Eernisse
Taylor Eernisse@theirongolddev·
Just try it out for a couple days - the problems arise very quickly. Build a system in one harness, symlink things into the other harnesses, then try doing the exact same work in those other harnesses; you’ll immediately notice output variations ranging from minor annoyances to catastrophic divergence from requirements. Claude and GPT run in very different harnesses. It’s unreasonable to expect them to work the same given the same context, as inconvenient as that is, because harness context is so influential on output.
1
0
1
12
@ramedey
@ramedey@RaulAmedey·
@pbakaus Do you have any examples? Could you elaborate?
2
0
0
80
Paul Bakaus
Paul Bakaus@pbakaus·
1. don't symlink Claude.md to AGENTS.md
2. don't symlink skills

"y u no like in-sync instructions?!" you say. fair, feels like a good idea. even npx skills suggests it as default. sadly, different harnesses/models need different prompting and have different tools, agents.
3
1
22
19.7K
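One way to avoid the symlink while still sharing common rules is to generate each harness's instruction file from a shared base plus a harness-specific override. The sketch below is only an illustration of that idea, not something described in the thread: the `base.md` and per-harness override file names, the `TARGETS` mapping, and the `render`/`build` helpers are all assumptions.

```python
from pathlib import Path

# Assumed mapping from harness name to the instruction file it reads.
TARGETS = {
    "claude": "CLAUDE.md",
    "codex": "AGENTS.md",
}


def render(base: str, override: str) -> str:
    """Join the shared instructions with harness-specific ones."""
    parts = [base.strip(), override.strip()]
    return "\n\n".join(p for p in parts if p) + "\n"


def build(root: Path, out: Path) -> dict[str, Path]:
    """Write one real file per harness instead of symlinking a single file.

    Hypothetical layout: root/base.md holds the shared rules and
    root/<harness>.md holds optional per-harness additions.
    """
    out.mkdir(parents=True, exist_ok=True)
    base = (root / "base.md").read_text(encoding="utf-8")
    written: dict[str, Path] = {}
    for harness, filename in TARGETS.items():
        override_path = root / f"{harness}.md"
        override = (
            override_path.read_text(encoding="utf-8")
            if override_path.exists()
            else ""
        )
        target = out / filename
        target.write_text(render(base, override), encoding="utf-8")
        written[harness] = target
    return written
```

Each output is an independent file, so prompting that only suits one harness stays in that harness's override and never leaks into the others.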
Taylor Eernisse retweeted
Tim Ferriss
Tim Ferriss@tferriss·
Ninety-nine percent of people in the world are convinced they are incapable of achieving great things, so they aim for the mediocre. The level of competition is thus fiercest for “realistic” goals, paradoxically making them the most time- and energy-consuming. If you are insecure, guess what? The rest of the world is, too. Do not overestimate the competition and underestimate yourself. You are better than you think. Unreasonable and unrealistic goals are easier to achieve for yet another reason. Having an unusually large goal is an adrenaline infusion that provides the endurance to overcome the inevitable trials and tribulations that go along with any goal. Realistic goals, goals restricted to the average ambition level, are uninspiring and will only fuel you through the first or second problem, at which point you throw in the towel. If the potential payoff is mediocre or average, so is your effort. The fishing is best where the fewest go, and the collective insecurity of the world makes it easy for people to hit home runs while everyone else is aiming for base hits. There is just less competition for bigger goals.
232
891
7.5K
490.7K
Taylor Eernisse
Taylor Eernisse@theirongolddev·
You’re right, but you’re describing the difference between a thoughtfully built tool bespoke to your specific usage and one that can be served to the masses. If you want this, go build it (you certainly have access to the systems and tooling), tuned to your specific needs. Expect any third party system to have made compromises that you can’t live with.
0
0
0
167
Evis Drenova
Evis Drenova@evisdrenova·
my ideal AI interface is a single never-ending chat thread. i don't want to think about the concepts of sessions, context windows, worktrees, mcp servers or really anything else. The harness should automate everything transparently.
193
66
1.7K
179.4K
Taylor Eernisse
Taylor Eernisse@theirongolddev·
@PythonPr Close, but not quite. I’m not the one whipping them; I’m in bed after delegating the whipping to another Claude agent.
1
1
2
265
Taylor Eernisse retweeted
Ashwin Gopinath
Ashwin Gopinath@ashwingop·
Claude Tag is a Trojan horse. Not because Anthropic is doing anything evil. Because the incentives are obvious. Day one, this looks like a great feature: tag Claude in Slack, let it follow the thread, remember context, connect to tools, break down tasks, chase work, and act like a teammate. But that is exactly the problem. The moment your AI vendor becomes a shared coworker, it stops being just a model provider. It starts becoming the place where work is interpreted, remembered, routed, and eventually executed. That is not model lock-in. That is context lock-in. You are now renting your company back from them. Models can be swapped. Agents can be copied. But the memory of how your company actually works is much harder, maybe impossible, to move: the Slack scar tissue, the exception paths, the customer promises, the unfinished threads, the weird workflows, the implicit owners, the “we tried that in Q2 and it failed” knowledge. Once that lives inside one vendor’s agent layer, you are not renting intelligence anymore. You are renting your company’s operating memory. And the pricing model makes it even more dangerous. A human coworker has a salary. Claude has unbounded tokenized activity. The more work moves through it, the more the vendor captures not just IT spend, but labor spend. This is the enterprise bargain people will regret: convenience now, and a rapid descent into dependency. The right architecture is simple: rent the best intelligence from whoever is best this month. OpenAI, Anthropic, Gemini, open source, whatever. But own the context layer. Your company memory should be inspectable, permissioned, portable, and model-neutral. It should not be buried inside the same vendor that sells you the intelligence and the workflow surface. Claude Tag is useful. That is why it is dangerous. Rent the intelligence, but own the context. Or, regret later.
Claude@claudeai

Introducing Claude Tag, a new way for teams to work with Claude. In Slack, Claude joins as a team member with access to the channels and tools you choose. Tag Claude in and delegate tasks to it while you focus on other work.

476
528
5K
1.5M
Taylor Eernisse retweeted
Kenton Varda
Kenton Varda@KentonVarda·
I actually think this is the wrong approach to agent authorization. Here's why: If you have to explicitly configure each agent's permissions, you've lost. Because you're only going to have patience to configure so many agent permissions. So in this route you can only have a certain relatively small number of agents before configuration fatigue prevents you from making more. I don't think that's what we want. I don't think that's what's good for AI safety. What we want is an enormous number of very fine-grained agents. Each task is a new agent. And each task has exactly the permissions needed for that task, no more, no less. There's really only one known way to make that manageable: Capability-based security. The basic idea is, when you give the agent a task, you naturally give it the capabilities it needs to perform that task. Like say you want an agent to review a Google Doc. Today, with a lot of AI assistants, you say "Hey go review the document titled Foo Spec". The agent has permissions to all your docs, so it goes and finds the right one and opens it. That's wrong. You should say "Hey, go review this document: " And then here's the key part: The harness should see that you're pasting a URL, and should infer that you want to give the agent access to that document. Only that document. No other document. Importantly, you didn't really have to do anything unusual to configure this. You just pasted the URL of the thing you wanted the agent to access. Which you probably would have done anyway. Sure, it's not always that easy. Maybe you commonly run agents that need access to 10 different things, and it's tedious to paste those 10 URLs every time. So you create some sort of a bundle that you give them. And of course, the agent should be able to ask for extra things it needs. But we need to get away from this idea that the agent always starts out with access to everything, even when it doesn't need most of it.
Also, I tend to think all agent authority has to derive from a human -- contrary to what is argued here. Every "autonomous agent" has to report to someone, and uses a subset of that person's authority. This is needed for accountability -- because agents are not accountable. If you see "Claude deleted the database", what are you supposed to do about that? You need to see "Claude acting on behalf of Bob deleted the database". To be clear, I totally agree that it's problematic when Alice configures an agent with her own credentials and then Bob tells the agent to do something with those credentials. Then you'll see "Claude acting on behalf of Alice", but actually Claude was acting on behalf of Bob. The answer is that Bob should not be able to command Alice's agent. Bob has his own agent, which may have all the same context, but operates with Bob's credentials. But if Alice sets up an agent for her team, Bob maybe doesn't want to spend time configuring his own version of it with all the same credentials, that's tedious. This has to be automated. I think capabilities make this easier. Alice gave a set of capabilities to her agent. The harness should be able to look at that list, and recreate the same list using Bob's credentials, without Bob having to do much except click "OK". I realize there's a lot of hand-waving here -- this is a complicated topic. I'll drop some code next week. If you're attending AI Engineer in SF, come to my talk on Tuesday, where I will also only be able to scratch the surface in 20 minutes... claude.com/blog/agent-ide…
86
57
685
84.6K
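The capability idea in the thread above can be sketched in a few lines: the harness scans the task text for pasted URLs and grants the new agent access to exactly those resources, recorded against the human it acts for. This is only a toy model under stated assumptions. `Capability`, `Agent`, `spawn_agent`, and `rebind` are hypothetical names, the URL regex is deliberately simple, and a real harness would also have to confirm that the person actually holds access to each resource.

```python
import re
from dataclasses import dataclass, field

# Deliberately simple URL matcher; an assumption, not a full URL grammar.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


@dataclass(frozen=True)
class Capability:
    resource: str   # the exact URL the human pasted
    principal: str  # the human whose authority the agent borrows


@dataclass
class Agent:
    task: str
    on_behalf_of: str
    capabilities: frozenset[Capability] = field(default_factory=frozenset)

    def can_access(self, url: str) -> bool:
        # Access exists only if this exact resource was granted.
        return Capability(url, self.on_behalf_of) in self.capabilities


def spawn_agent(task: str, principal: str) -> Agent:
    """One task, one agent, holding only the URLs named in the task."""
    urls = {u.rstrip(".,;:!?)") for u in URL_RE.findall(task)}
    caps = frozenset(Capability(u, principal) for u in urls)
    return Agent(task, principal, caps)


def rebind(agent: Agent, new_principal: str, approved: bool) -> Agent:
    """Recreate the same capability list under another person's authority.

    `approved` stands in for that person clicking "OK".
    """
    if not approved:
        raise PermissionError(f"{new_principal} did not approve the grant")
    caps = frozenset(
        Capability(c.resource, new_principal) for c in agent.capabilities
    )
    return Agent(agent.task, new_principal, caps)
```

Because every capability carries a principal, an audit log built on this model can always say "agent acting on behalf of Bob" rather than just "agent".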
Taylor Eernisse
Taylor Eernisse@theirongolddev·
this was on an account I've been driving all day without issues, and it happened in the middle of running work, not after resuming a session with expired cache
0
0
0
14
Taylor Eernisse
Taylor Eernisse@theirongolddev·
@trq212 @AnthropicAI @bcherny I just watched my weekly usage jump from 78% to 100% in an instant on a Max 20 plan. My session usage remained steady at 16% - what's going on?
1
0
0
26
Darin
Darin@darin_gordon·
@jxnlco Every single time I've let the cart get away from me, it resulted in a disaster
1
0
2
131
jason
jason@jxnlco·
if you can easily answer 'what are you working on' you're not using agents enough.
79
34
586
40.5K
Taylor Eernisse
Taylor Eernisse@theirongolddev·
@mjovanovictech Also reliability and availability - both are often of higher importance over time than better results.
0
0
0
42
Milan Jovanović
Milan Jovanović@mjovanovictech·
This is two years' worth of Claude 20x Max (at today's price). And you get far better results with Opus/Fable. Local LLMs are not worth it with today's hardware. Unless you can afford a $500k-1M home data center. Privacy story is a different concern.
corbin@corbin_braun

for the small price of $4,679 I will never need to hire an employee again. you are undervaluing whats possible with local llm truly. anything a virtual assistant can do, a NVIDIA DGX Spark can do. at the cost of electricity.

95
10
370
74.5K
Taylor Eernisse
Taylor Eernisse@theirongolddev·
I’ve been having a ton of fun setting this up the past couple of weeks. I have a custom bridge running on my local machine that scans for tagged issues and spins up Claude or Codex agents inside herdr panes locally to do work. If the work doesn’t touch critical paths it gets automerged on green CI, otherwise it parks for my review. The visibility is the real gain here. Being able to see work as it’s done autonomously is something I think lots of people downplay too often. Linear functions really well as a control plane!
0
1
0
191
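The merge rule described in the post above (automerge on green CI unless the work touches critical paths, otherwise park for review) can be sketched as a small policy function. This is a sketch under assumptions, not the author's bridge: `AgentPR`, `CRITICAL_PATHS`, `decide`, and the path patterns are hypothetical, and the `wait_for_ci` outcome covers a case the post does not mention.

```python
from dataclasses import dataclass
from fnmatch import fnmatch

# Hypothetical patterns; the post does not say which paths count as critical.
CRITICAL_PATHS = ("migrations/*", "auth/*", "infra/*")


@dataclass
class AgentPR:
    changed_files: list[str]
    ci_green: bool


def touches_critical_path(pr: AgentPR, patterns=CRITICAL_PATHS) -> bool:
    """True if any changed file matches any critical pattern."""
    return any(fnmatch(f, p) for f in pr.changed_files for p in patterns)


def decide(pr: AgentPR) -> str:
    # Critical work always parks for human review, whatever CI says.
    if touches_critical_path(pr):
        return "park_for_review"
    # Non-critical work merges by itself once CI is green.
    if pr.ci_green:
        return "automerge"
    # Assumption: with CI pending or red, do nothing yet.
    return "wait_for_ci"
```

Keeping the decision in one pure function makes the policy easy to inspect and test separately from whatever polls the issue tracker and launches agents.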
Karri Saarinen
Karri Saarinen@karrisaarinen·
The software factory is humming along in @linear. An issue comes in from Intercom to Triage. Linear routes and categorizes it, investigates, opens a coding session, and has a fix ready for review 10 minutes later. We’re now merging 50 to 70 Linear agent generated fixes per week just from Triage. (Usage is priced with clear token-based credits. No monthly AI fees, no expiring credits.)
38
13
512
46.6K
Taylor Eernisse
Taylor Eernisse@theirongolddev·
@jaminlabs @mattpocockuk If you’re asking, you haven’t used Opus enough yet. They’re both really good and each has its place, but I really do not want Codex running my automations. Claude Code’s harness and the Claude models are just better at the kinds of work you typically use an AFK agent for.
0
0
3
84
Jamin
Jamin@jaminlabs·
@mattpocockuk Isn’t codex enough? I’m genuinely curious why people favor opus
3
0
3
887
Matt Pocock
Matt Pocock@mattpocockuk·
Anthropic has opted against slashing AFK/third-party usage, at least for now. Go back to sleep, Codex, I don't need you yet.
68
22
765
70.1K
Taylor Eernisse
Taylor Eernisse@theirongolddev·
@HoochNScooch @TLM_Ryan The thing about “equality” is that it has to be measured, which means you have to be tracking it, which means nobody is doing anything simply because it needs doing but because they’re trying to perform to some external metric that does nobody any good.
1
0
17
4.5K
Cajun Raisin
Cajun Raisin@HoochNScooch·
@TLM_Ryan Stop treating your marriage like a zero sum math problem. Holy fuck, just do dishes because they’re dirty, without keeping score. I’ve been married 20yrs and I feel like I hear advice like this from guys who’ve had 2 divorces in 10 years.
8
4
613
59K
TLM Ryan 📊 ☧
TLM Ryan 📊 ☧@TLM_Ryan·
My marriage was never worse than when we were near equal on this metric. Scope creep due to some bad post-partum. One day I just stopped. I reverted to stereotypical gender tasks and nothing more. Relationship resolved almost overnight.
The Institute for Family Studies@FamStudies

Modern dads are helping out at home more than ever, @lymanstoneky finds. Married dads of young children in 1965 did, on average, less than 10 hours a week of child care or help around the house. Dads in 2024 contributed nearly 30 hours a week, an increase of approximately 300%.

123
63
2.4K
1.7M
Taylor Eernisse
Taylor Eernisse@theirongolddev·
@beherleader If this is normal in modern marriage, no wonder so many end in divorce. Can’t hope to grow a relationship when your barbs are always facing towards each other instead of in the same direction - outward against all who seek to divide you, and inward against your own selfishness.
0
0
3
204
Will Knowland
Will Knowland@beherleader·
What's wrong with modern marriage... Wife: You’re doing the avoidant thing again. Husband: And you’re being anxious and dysregulated. Wife: I’m not dysregulated. I’m reacting to your attachment wound getting triggered. Husband: No, you’re projecting. My nervous system is shutting down because your energy feels unsafe. Wife: There it is. You always say that when you want to escape accountability. Husband: I’m not escaping accountability. I’m setting a boundary because you’re activating my childhood trauma. Wife: Your childhood trauma is not an excuse for stonewalling me. Husband: And your abandonment wound is not an excuse for controlling me. Wife: I’m not controlling you. I’m asking for connection. Husband: It doesn’t feel like connection. It feels like criticism. Wife: Because you hear everything as criticism when your attachment system is threatened. Husband: And you hear every pause as rejection because you haven’t worked on your insecurity. Wife: Wow. So now my insecurity is the problem? Husband: I feel disrespected right now.
19
7
146
17.5K
Taylor Eernisse
Taylor Eernisse@theirongolddev·
@seconds_0 People need to buy their own server infra and run it locally. I’m super glad I bought my Dell r730xd when I did a couple years ago! And with AI it’s never been easier to jump into self hosting your own apps and services!
1
0
0
542
Taylor Eernisse
Taylor Eernisse@theirongolddev·
Herdr is pretty great too. They’re all superior right now to any desktop app. The biggest problem with the desktop apps imo is how they’re optimized under the hood to behave the “official” way and capture my chat data/memory/etc in ways the cli tools aren’t able to. Lets me control my own destiny a little bit more. Plus none of them have great UIs anyway; features are missing, inconsistent, or janky. It’s already a tremendous pain to manage the schema differences for skills, memory systems, hooks, agents, mcps, and instruction files (CLAUDE.md why…). I don’t want to have to learn the different desktop app idiosyncrasies on top of that while ALSO having to figure out how to port all my cli tools/hooks/skills/agents into the stupid desktop apps too.
0
0
0
50
AJ Stuyvenberg
AJ Stuyvenberg@astuyve·
I spent a few months using codex/claude/opencode desktop apps but I've mostly given up and gone back to the terminal. tell me why I'm wrong/what I'm missing
40
1
53
17K