Chris Raethke

4.6K posts

@codesoda

Improving communication using product thinking and machine learning. Prev @NotivHQ @bugcrowd. Purveyor of bad dad jokes! #peoplefirst

San Francisco, CA · Joined December 2008
992 Following · 2.6K Followers
Matt Shumer @mattshumer_
Agents that natively self-orchestrate, managing their own context, tools, and sub-agents, are the next big unlock in LLM performance.
Right now, a skilled engineer building an optimized harness, with thoughtful data flow, separation of concerns, sub-agent management, etc., can make dramatic improvements over baseline for specific tasks. If a model could do this itself, that’d be a major step forward. You give it an objective and a set of tools, and it figures out the optimal way to orchestrate itself to do the task.
For example, I’m building a very primitive AI scientist that I’ll open-source soon. Most of the work isn’t in the prompt, it’s in the harness… what the orchestrator sees, what sub-agents see, what gets shared between them and when, where we summarize vs. pass raw data, and which tools each agent controls. Doing this allows me to dramatically improve what the model can do on its own.
If a model can effectively design its own harness for a given problem, it’d be a huge step forward.
My bet: self-orchestrating models… ones that manage their own context, tools, and sub-agents, will move the frontier almost as much as the jump from chatbot → reasoning did. Maybe more.
51
18
316
67K
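The harness decisions listed in the post above (what the orchestrator sees, what sub-agents see, where to summarize vs. pass raw data, which tools each agent controls) can be made concrete with a small sketch. Everything here is hypothetical: `SubAgent`, `Orchestrator`, `summarize`, and `pass_raw` are illustrative names, not the author's unreleased harness, and simple truncation stands in for a model-written summary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# Hypothetical illustration only; no real agent framework is used here.
Tool = Callable[[str], str]


@dataclass
class SubAgent:
    name: str
    tools: Dict[str, Tool]  # tools that only this agent controls
    context: List[str] = field(default_factory=list)

    def run(self, task: str) -> str:
        # Stand-in for a model call: apply every tool and return the raw output.
        self.context.append(task)
        return "\n".join(f"{n}: {tool(task)}" for n, tool in self.tools.items())


def summarize(raw: str, limit: int = 60) -> str:
    # Stand-in for a model-written summary: truncate to `limit` characters.
    return raw if len(raw) <= limit else raw[: limit - 3] + "..."


@dataclass
class Orchestrator:
    agents: Dict[str, SubAgent]
    pass_raw: Set[str] = field(default_factory=set)  # agents shared verbatim
    context: List[str] = field(default_factory=list)

    def delegate(self, agent_name: str, task: str) -> str:
        raw = self.agents[agent_name].run(task)
        # The orchestrator sees raw output only for agents listed in pass_raw.
        shared = raw if agent_name in self.pass_raw else summarize(raw)
        self.context.append(f"[{agent_name}] {shared}")
        return shared
```

The point of the sketch is that each choice is an explicit field or branch: a self-orchestrating model would have to pick `pass_raw`, the tool assignment, and the summary policy for itself.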
Chris Raethke @codesoda
Your coding agent accumulates patterns you never notice. Skillable mines your session transcripts to find repeating workflows and turn them into skills. Works with Claude Code, Codex, OpenCode + 9 more. github.com/codesoda/skill…
0
0
2
83
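One plausible way to find repeating workflows in session transcripts is to count action sequences that recur across sessions. This is a rough sketch only and not Skillable's actual method; `repeated_workflows` and the action labels are made up for illustration, and each session is assumed to be already reduced to a list of action names.

```python
from collections import Counter
from typing import List, Tuple


def repeated_workflows(
    sessions: List[List[str]], n: int = 3, min_sessions: int = 2
) -> List[Tuple[Tuple[str, ...], int]]:
    """Return n-step action sequences seen in at least `min_sessions` sessions."""
    counts: Counter = Counter()
    for actions in sessions:
        # A set ensures a sequence repeated inside one session counts once.
        seen = {tuple(actions[i : i + n]) for i in range(len(actions) - n + 1)}
        counts.update(seen)
    return [(seq, c) for seq, c in counts.most_common() if c >= min_sessions]
```

Sequences that clear the threshold are candidates for a skill; a human (or a model) would still need to decide which ones are worth naming.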
Zara Zhang @zarazhangrui
How I build skills:
1. Brute-force a task with Claude Code
2. Iterate until it meets my criteria
3. Get Claude to read this folder, where I pre-downloaded blogs & docs from Anthropic team, and turn what we just did into a skill
4. Eval the skill using skill-creator plugin
Thariq @trq212

x.com/i/article/2033…

11
24
262
22.7K
Chris Raethke @codesoda
I worked with GPT 5.4 to brainstorm, then Codex and I wrote a PRD (see tasks). That was then strategically broken up into detailed specs using my auto-ralph skill, which I reviewed meticulously. Now auto-ralph is [planning, building, reviewing]+ through each spec to build it out.
1
0
2
74
Chris Raethke @codesoda
Currently ralphing a new thing: AI-first bookmarking. What's interesting is not so much what it is, but how it's coming to fruition. github.com/codesoda/agent…
1
0
2
52
Chris Raethke @codesoda
@camhahu In theory, but I feel the value (and current need) is more on the vibe-coded end of the AI maturity spectrum
1
0
1
17
Cameron @camhahu
@codesoda time to rename to agentic-engineering-audit? 🤔
1
0
0
55
Chris Raethke @codesoda
@camhahu A vibe code audit came up with this scope. My ralph PRD skill broke it down into specs. Now ralph is building it all: planning, implementing, reviewing, and pushing step by step to github.com/codesoda/vibe-… This will be better code and tests than I could dream of being able to write.
1
0
1
33
Cameron @camhahu
what if humans only reviewed changes to the CI, not the code?
we decide what decides what passes
feels scary but directionally correct
1
0
2
75
Chris Raethke @codesoda
@camhahu Even better with proper embeddings. Do you have a log you can DM? I can have a look.
1
0
1
12
Cameron @camhahu
@codesoda looks like mine fell back to just bm25 and it's not clear why - but even with that, these suggestions are really good
1
0
0
21
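For context on the BM25 fallback mentioned above: BM25 ranks documents by keyword overlap alone, with no embeddings involved. Below is a minimal sketch of the standard Okapi BM25 formula, assuming pre-tokenized, non-empty documents and the common defaults `k1=1.5` and `b=0.75`. It is not the tool's own implementation, and `bm25_scores` is a hypothetical name.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query: List[str], docs: List[List[str]], k1: float = 1.5, b: float = 0.75
) -> List[float]:
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalization: longer documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Because scoring depends only on shared tokens, it can still give useful suggestions when the vocabulary overlaps, which fits the observation in the post.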
Cameron @camhahu
every great engineer I know who isn't token-maxing right now dislikes AI because it makes you worse before you figure out how to use it right
1
0
3
108
Cameron @camhahu
@codesoda what have you been doing for audits? particularly on stuff where you're doing little/no manual review
we've looked at some stuff like cyclomatic complexity and module boundaries but no silver bullets
1
0
0
17
Chris Raethke @codesoda
@Al_Grigor If you are using terraform then make sure you have a thorough way to test your setup, and make sure it's documented in your AGENTS.md. I'm biased but the best way to do this is with my terra-agent skill from the x-agents project. github.com/codesoda/x-age…
0
0
0
31
Alexey Grigorev @Al_Grigor
Claude Code wiped our production database with a Terraform command. It took down the DataTalksClub course platform and 2.5 years of submissions: homework, projects, and leaderboards. Automated snapshots were gone too. In the newsletter, I wrote the full timeline + what I changed so this doesn't happen again. If you use Terraform (or let agents touch infra), this is a good story for you to read. alexeyondata.substack.com/p/how-i-droppe…
1.5K
1.6K
11K
4.1M
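One guardrail for agents that touch infrastructure is to inspect a saved plan for deletions before anything is applied. The sketch below assumes the plan has been exported with `terraform show -json` and reads the `resource_changes[].change.actions` field of that JSON. It is an illustration only, not what the newsletter describes changing, and `destructive_changes` is a hypothetical helper.

```python
import json
from typing import List


def destructive_changes(plan_json: str) -> List[str]:
    """Return addresses of resources the plan would delete or replace."""
    plan = json.loads(plan_json)
    return [
        rc["address"]
        for rc in plan.get("resource_changes", [])
        # A replacement shows up as both "delete" and "create" actions.
        if "delete" in rc.get("change", {}).get("actions", [])
    ]
```

A wrapper script could refuse to continue whenever this list is non-empty and require explicit human approval instead.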
Chris Raethke @codesoda
@camhahu Or running vibe code audits overnight, so I can then work on improvements the next day.
1
0
1
18
Chris Raethke @codesoda
@camhahu I get to the end of the day and "oh what! I have tokens left before I'm rate limited, time to fire off some ralph loops on some smaller lower priority work items"
2
0
1
31
Chris Raethke @codesoda
@camhahu Love this, Andreessen made a similar comment in his recent interview. Engineers do so much more than punching on a keyboard.
1
0
1
9
Cameron @camhahu
think of something you want to build, the fastest way is with an LLM, but the LLM still needs you
therefore: coding is solved, software isn't
1
0
6
92
Garry Tan @garrytan
Here's my process:
1. Spot a problem while using the product (or have a feature idea)
2. Describe the architecture to Claude in enough detail that the first draft is 80% right
3. Review the output, catch the subtle bugs (ordering of side effects, race conditions, security issues)
4. Iterate 2-4 times within the same PR (your PRs average 3-5 sub-commits, each fixing something the previous round missed)
5. Ship it: version bump, changelog, merge
Do you guys follow this or do you do something else? I'm curious
176
58
1K
86.5K
Chris Raethke @codesoda
You know what would be nice: different permission profiles inside a @claudeai project. E.g. I'm in the dev profile, so that has a bunch of skills, tools, etc., enabled. I'm in the AWS/DevOps profile, so I enable all the default things for that. But in general, it's locked down.
0
0
0
73
Chris Raethke @codesoda
@gdb I'd love a way to skip the sandbox when launching though. I'm often doing a bunch of ops-type things, which means actually making mods to the local machine.
0
0
0
140
Zara Zhang @zarazhangrui
Seems to me that every new AI product is trying to grab the attention of a very small group of elite, AI-savvy users who are completely bombarded and tired of their marketing tactics.
While at the same time, normal people who are outside the AI circle are completely oblivious to the latest developments in AI & only using it as a Google replacement.
71
11
236
15.6K
tanmaye @tanmaye_bhatia
@nicbstme Interesting! I think there needs to be a step function to have dynamic and reliable UIs in agent windows. They’re not good at this right now. If you want to act on data, buttons and clicks are more reliable than md files. I wonder if any companies are building here
1
0
1
237