David Sweet

3.1K posts

@phinance99

Applied Epistemologist · Learn to experiment: https://t.co/F9l8CmYFn2 · Prevent code complexity creep: cargo install kiss-ai

Manhattan, NY · Joined January 2008
389 Following · 219 Followers
David Sweet
David Sweet@phinance99·
I think of it this way: All of that money comes from purchases and investments made by people in society. If you drive a Tesla or order from Amazon or have a 401k, you're generating the wealth of Musk or Bezos. We need to ask ourselves whether we could have gotten the same output from them for less. We need to negotiate better. Musk, Bezos, et al. wouldn't have stormed off, pouting, if they'd only accumulated half as much wealth in exchange for their efforts. Case in point: MacKenzie Bezos got half of Jeff's wealth in the divorce. He kept on truckin', undeterred. There's no right amount. There's no "he deserves" or "you deserve". It's just a negotiation for a good rate. Just business. Society needs to up its game.
unusual_whales@unusual_whales

Bernie Sanders: “60% of our people living paycheck-to-paycheck, and one guy, Elon Musk, owns more wealth than the bottom 53% of American households... Think maybe that might be an issue that we should be talking about?”

English
0
0
0
12
AVB
AVB@neural_avb·
I am surprised how so many people got so excited about autoresearch when similar ideas (iterative refinement through experiments) have existed in AI for years. Maybe it went viral because of its simplicity (everyone can understand it), or because it's so accessible (all you need is a coding agent), or because it's literally Dr. Karpathy (the goat) that dropped it.
AVB tweet media
English
11
4
49
2.5K
David Sweet
David Sweet@phinance99·
@oelma__ Seven Seas Italian dressing, then chill. This is maximal (yum * health) / dollar.
English
0
0
1
6
Elma
Elma@oelma__·
What can I put in canned green beans to improve their flavor? I hate them, but we're on a budget and have plenty of them.
Elma tweet media
English
5.2K
49
840
242.1K
David Sweet
David Sweet@phinance99·
This problem was solved in ~1950. Read about Toyota quality. Also: Shewhart, Deming, Six Sigma. tl;dr: Define quality. Measure it at every step. Don't proceed to the next step until quality is high enough. Take small steps. In practical terms: Make a small change. Insist it passes linters, tests, and reviews. Repeat ad infinitum. Also: `cargo install kiss-ai`, a linter for code complexity.
English
10
1
26
1.6K
Ujjwal Chadha
Ujjwal Chadha@ujjwalscript·
Your AI Agent is mathematically guaranteed to FAIL. This is the dirty secret the industry is hiding in 2026.

Everyone on your timeline is currently bragging about their "Multi-Agent Swarms." Founders are acting like chaining five AI agents together is going to replace their entire engineering team overnight. Here is the reality check: it's a mathematical illusion.

Let's look at the actual numbers. Say you have a state-of-the-art AI agent with an incredible 85% accuracy rate per action. In a vacuum, that sounds amazing. But an "autonomous" workflow isn't one action. It's a chain: Read the ticket ➡️ Query the DB ➡️ Write the code ➡️ Run the test ➡️ Commit.

Let's do the math on a 10-step process: $0.85^{10} \approx 0.197$. Your "revolutionary" autonomous system has a success rate of roughly 20%.

And the real-world data proves it. Recent studies out of CMU this year show that the top frontier models are failing at over 70% of real-world, multi-step office tasks.

We are officially in the era of "Agent Washing." Startups are rebranding complex, buggy software as "autonomous agents" to look cool, but they are ignoring the scariest part: AI fails silently. When traditional code breaks, it crashes and throws a stack trace. When an AI agent breaks, it doesn't crash. It just confidently hallucinates a fake database entry, sidesteps a broken API by faking the response, and keeps running, corrupting your data for weeks before you notice.

If your "automated" system requires a senior engineer to spend three hours digging through prompt logs to figure out why the bot made a "creative decision," you didn't save any time. You just invented a highly expensive, unpredictable form of technical debt.

Stop trying to build fully autonomous swarms to replace human judgment. Start building deterministic guardrails where AI is the engine, but the engineer holds the steering wheel.
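The compounding math in the tweet is easy to verify. A quick sketch in Python, using the tweet's own numbers (85% per-step accuracy over a 10-step chain):

```python
# Probability that an agent completes an n-step chain when every
# step succeeds independently with probability p.
def chain_success(p: float, n: int) -> float:
    return p ** n

# The tweet's numbers: 85% per-step accuracy over 10 steps.
print(f"{chain_success(0.85, 10):.3f}")  # 0.197, i.e. roughly a 20% success rate

# Flip the question: how accurate must each step be for the whole
# 10-step chain to succeed 90% of the time? Solve p^10 = 0.90.
print(f"{0.90 ** 0.1:.4f}")  # each step needs ~98.95% accuracy
```

The second number is the uncomfortable one: per-step reliability has to be near-perfect before long autonomous chains become trustworthy, which is the tweet's argument for guardrails.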
English
129
55
379
28.3K
David Sweet
David Sweet@phinance99·
/kpop Stress-test plan.md
/kpop Double-check review.md
/kpop Fix this bug. You have a budget of 10 hypotheses.
/kpop Speed this up by 2x. You have a budget of 20 hypotheses.
/kpop Get this to run within bounded memory. You have a budget of 30 hypotheses.
English
1
0
0
16
David Sweet
David Sweet@phinance99·
@yishan
/kpop Stress-test plan.md
/kpop Double-check review.md
/kpop Fix this bug. You have a budget of 10 hypotheses.
/kpop Speed this up by 2x. You have a budget of 20 hypotheses.
/kpop Get this to run within bounded memory. You have a budget of 30 hypotheses.
English
0
0
0
28
David Sweet
David Sweet@phinance99·
Bingo. This works for all kinds of things. It's actually Karl Popper's scientific method: Hypothesis and Falsification. Treat everything that comes out of the LLM as a hypothesis, then ask it to falsify. It's read Popper, so it knows (better than me!) how to do it. Here's a prompt (Cursor command) you can use for lots of things: github.com/dsweet99/agent…
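The hypothesis-and-falsification move can be phrased as a reusable prompt template. A minimal sketch in Python; the wording and function name are my own illustration, not the prompt from the linked repo:

```python
# Popper-style loop: treat LLM output as a hypothesis, then ask the
# model to attack it rather than defend it.
def falsification_prompt(hypothesis: str, budget: int = 10) -> str:
    return (
        f"Treat the following as a hypothesis, not an established fact:\n"
        f"  {hypothesis}\n"
        f"Now try to falsify it. Propose up to {budget} concrete tests "
        f"that could prove it wrong, run the cheapest ones first, and "
        f"report which, if any, succeed."
    )

print(falsification_prompt("This change makes the service 2x faster.", budget=20))
```

The hypothesis budget bounds the search, which is the same idea as the budgets in the /kpop commands above.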
English
1
0
0
386
Yishan
Yishan@yishan·
I have stumbled onto a way to improve agent steering. Namely, how to improve performance when you say "make sure you do this" and the LLM doesn't do it. Here it is: Saying "remember to do X" is unreliable - it requires the agent LLM to spontaneously initiate a procedural behavior. But presenting the agent with a specific, possibly-wrong claim ("You should be doing X - are you still doing it?") reliably triggers corrective behavior when the claim is wrong. The agent doesn't need to remember to check. The mismatch between presented state and actual state creates a correction event that the agent LLM naturally responds to. This reminds me of the old maxim that "the best way to get a correct answer on the internet is to post a wrong one", and I guess that makes sense since LLMs are predominantly the distilled "knowledge" of the internet. Anyhow, I've been building a long-running memory system for my agents, and implementing it this way fixed a lot of problems.
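The steering trick drops into a tiny prompt helper. A sketch in Python; the function name and message wording are mine, not part of Yishan's memory system:

```python
# Instead of "remember to do X" (which relies on the agent spontaneously
# initiating a check), present a specific, possibly-wrong claim about its
# state. A mismatch between the claim and reality is what triggers the
# correction event.
def steering_check(behavior: str, claimed_state: str) -> str:
    return (
        f"You should be {behavior}. "
        f"It looks like you are currently {claimed_state}. "
        f"Is that still true? If not, correct course before continuing."
    )

print(steering_check(
    behavior="logging every intermediate result to results.log",
    claimed_state="appending to results.log after each step",
))
```

Injecting a message like this at intervals turns "remember to do X" into a state comparison the model responds to naturally.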
English
12
2
114
9.4K
David Sweet
David Sweet@phinance99·
@DassCoool @rohindhar Yea. It's like a small step up from linoleum. Even the "good" engineered wood. (Apologies to those who like it, but ...)
English
1
0
1
20
Dass Coool
Dass Coool@DassCoool·
@rohindhar I wouldn't put engineered hardwood in my modest home. I ripped some out and put in more hardwood in the one room where the previous owner had the fake stuff. I can't imagine it would be taken seriously anywhere.
English
1
0
3
1.4K
Rohin Dhar
Rohin Dhar@rohindhar·
If you're doing a high-end house flip, I don't think you can use engineered hardwood floors anymore. Gotta go with the real thing.
English
43
1
354
82.2K
David Sweet
David Sweet@phinance99·
@rohanpaul_ai That looks a lot smarter than remote control. Both technically and as a business.
English
0
0
0
32
Rohan Paul
Rohan Paul@rohanpaul_ai·
🇨🇳 It has started. A new home service in China pairs human cleaners with autonomous AI robots to tackle household chores.

Residents in Shenzhen can now book a service where a human professional and an autonomous robot arrive together to clean their home. Real houses present a chaotic mess of dropped toys and random furniture that confuses traditional machines. @XSquareRobot and a major service platform named 58[.]com decided to tackle this chaos by launching China's first robot cleaner service in March-26.

Customers use an application to hire a cleaning crew that consists of 1 human worker and 1 robot. The human takes care of the tricky chores that require complex judgment. The robot handles the repetitive physical work, like picking up trash and wiping down flat surfaces.

This machine runs on a system called WALL-A, which acts as a single continuous AI brain rather than a list of pre-written rules. They built this AI foundation model to perceive its surroundings and make its own decisions without human guidance. It processes visual data and plans multi-step actions. And deploying these robots into actual homes now provides the massive amounts of extremely important training data needed to improve it continuously.

Alibaba and ByteDance backed this project. IMO, if the foundation model behind it figures out how to navigate a messy living room without getting stuck, it can learn to operate in almost any other physical environment.
English
34
79
415
47.9K
andy
andy@1a1n1d1y·
frog’s eyes can detect single photons i believe we are on the cusp of a computing breakthrough
English
22
22
495
50.4K
David Sweet
David Sweet@phinance99·
I've found the people who have the most trouble with agent coding are the people most uncomfortable with uncertainty. I'm reading things like "predictability" and "100% certainty".

1. There is no such thing.
2. The low failure rates in manufacturing processes come about *because* of quality control. Not the other way around.

"All the screws look the same" is the outcome of a quality-controlled process. Metal does not come from the earth with "predictability" or "100% certainty" in any aspect other than atomic number.
English
0
0
1
32
David Sweet
David Sweet@phinance99·
@RhysSullivan Have I got a treat for you: All of them are. Just do `cargo install kiss-ai`. It's like ruff for code complexity. The LLMs know all the code-factoring best practices, they just need guidance on when & where to apply them. Put kiss in the coding loop and you're golden.
English
0
0
0
24
Rhys
Rhys@RhysSullivan·
are any of the models actually good at doing large refactors? i have to spend so much time fighting with them to not take shortcuts and actually make large changes to code
English
100
3
153
22.8K
David Sweet
David Sweet@phinance99·
As Claude would say, "You're absolutely right!" Stability comes from grounding.md. In there I list the main objectives and constraints of the system. The reviewers search for deviations from grounding.md in both code and tests. Without some stable, static element like that, the code will drift, just like you're saying.
English
1
0
0
20
David Sweet
David Sweet@phinance99·
The plans/PRs aren't huge. That's how I worked before AI, too: Take a small risk, then lock down quality. Repeat forever. It's just a lot easier now! There's no more reason to fear an agent making a mess than there was to fear a person doing it. As long as you insist on quality before merging, your codebase is safe.
English
0
0
0
21
David Sweet
David Sweet@phinance99·
My coding today looks like this:
1. Write plan.md with Cursor.
2. Run a script from the CLI.
3. Write the PR notice with Cursor.
Step 2 runs a loop like this:
- Implement plan.md
- Review, coarse-grained (repeat until passed)
- Review, fine-grained (repeat until passed)
Details and script: github.com/dsweet99/agent…
David Sweet@phinance99

This problem was solved in ~1950. Read about Toyota quality. Also: Shewhart, Deming, Six Sigma. tl;dr: Define quality. Measure it at every step. Don't proceed to the next step until quality is high enough. Take small steps. In practical terms: Make a small change. Insist it passes linters, tests, and reviews. Repeat ad infinitum. Also: `cargo install kiss-ai`, a linter for code complexity.
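The quality-gated loop described here can be sketched as a gate that refuses to advance until every check passes. A sketch in Python; `ruff` and `pytest` are stand-in commands, so substitute whatever linters, tests, and review scripts your project actually uses:

```python
import subprocess

# Quality gates, run in order. Don't proceed until the current change
# passes every one: the Shewhart/Deming idea applied to an agent loop.
GATES = [
    ["ruff", "check", "."],  # lint (stand-in command)
    ["pytest", "-q"],        # tests (stand-in command)
]

def all_gates_pass() -> bool:
    for cmd in GATES:
        if subprocess.run(cmd).returncode != 0:
            return False
    return True

def quality_loop(implement_step, max_rounds: int = 20) -> bool:
    """Make a small change, then insist it passes every gate before merging."""
    for _ in range(max_rounds):
        implement_step()      # e.g. ask the agent to apply one small change
        if all_gates_pass():
            return True       # safe to merge
    return False              # budget exhausted; escalate to a human
```

The point of `max_rounds` is the same as the hypothesis budgets elsewhere in this feed: bound the loop so a stuck agent fails loudly instead of spinning forever.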

English
1
0
0
59
David Sweet
David Sweet@phinance99·
Hear! Hear! I was warned by a personal trainer not to do this exercise, to avoid hurting my back. For years I babied my back, and it got worse and worse. Then I did this exercise. It felt AMAZING after all that time avoiding it. All the exercises you mentioned have found their way into my routine, and I feel better and better.
English
0
0
0
62
Kevin Dahlstrom
Kevin Dahlstrom@Camp4·
In the 3 years since I severely herniated a disc at L5/S1, I’ve tried everything under the sun to rehab and prevent reinjury. If I had to recommend just one exercise, it would be this one. It’s the single best exercise for strengthening the posterior chain and the intricate scaffold of muscles that support the spine. It’s a $100 piece of equipment (link in the comments below). Start with static holds and work your way up to 2 minutes. Then start moving as shown below and work your way up to 3 sets of 30. Then you can do more advanced things like added weight and one-leg static holds. A 3-minute deep squat each morning plus this exercise 2-3 times a week plus hanging knee lifts is a great basic back mobility program. If you want to go deeper, check out my full back program in the comments below.
Kevin Dahlstrom tweet media
English
129
357
4.9K
1M