PARK JUN WOO

148 posts

@park_jun_woo

Building Neuro-Symbolic Agents. Ratchet Pattern + Symbolic Feedback Loops. Making LLM agents actually finish the job.

Joined October 2024

20 Following · 21 Followers

PARK JUN WOO@park_jun_woo·9h

LLM reviewed 88 items as pass. Only 56 were actually correct. 36% false pass rate. Honestly — how are you trusting LLM output without a deterministic gate?
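
The 36% figure follows directly from the two counts in the post. A minimal sketch of the arithmetic; the function name `false_pass_rate` is illustrative and not from the original post.

```python
def false_pass_rate(passed_by_llm: int, actually_correct: int) -> float:
    """Fraction of LLM-approved items that were not actually correct."""
    if passed_by_llm <= 0:
        raise ValueError("passed_by_llm must be positive")
    if not 0 <= actually_correct <= passed_by_llm:
        raise ValueError("actually_correct must be between 0 and passed_by_llm")
    return (passed_by_llm - actually_correct) / passed_by_llm


# 88 items marked as pass, 56 of them really correct -> 32 false passes.
rate = false_pass_rate(88, 56)
print(f"{rate:.1%}")  # 36.4%
```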

PARK JUN WOO@park_jun_woo·10h

@michael_c_law Writing mechanical verification code is the less painful path.

Michael Lawrence@michael_c_law·16h

"We suffer more in imagination than in reality," wrote Seneca, who never watched an agentic AI run unsupervised in production.

PARK JUN WOO@park_jun_woo·14h

@RhoRider AI generated the data. AI verified the data. Probabilistic checking probabilistic — that's not verification, that's compounding the gamble. One deterministic gate would have cost 30 minutes. Skipping it cost 2x the project.
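
One possible shape for such a gate is a plain reconciliation check that compares the source records with what was loaded. This is a sketch under assumptions: the field names `id` and `amount` and the function `reconcile` are hypothetical, and the client's actual feed format is not described in the thread.

```python
from decimal import Decimal


def reconcile(source_rows, loaded_rows, key="id", amount="amount"):
    """Return a list of discrepancies; an empty list means the feed reconciles."""
    # Decimal avoids float rounding noise when comparing money amounts.
    src = {r[key]: Decimal(str(r[amount])) for r in source_rows}
    dst = {r[key]: Decimal(str(r[amount])) for r in loaded_rows}
    problems = []
    for k in sorted(src.keys() - dst.keys()):
        problems.append(f"missing in loaded feed: {k}")
    for k in sorted(dst.keys() - src.keys()):
        problems.append(f"unexpected in loaded feed: {k}")
    for k in sorted(src.keys() & dst.keys()):
        if src[k] != dst[k]:
            problems.append(f"amount mismatch for {k}: {src[k]} != {dst[k]}")
    return problems
```

The check gives the same answer every time for the same inputs, so a non-empty result can block delivery of a report regardless of what an AI reviewer said.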

Rho Rider@RhoRider·1d

It’s not just some theoretical big tech firms in the future who will be impacted by “Vibe Code Capitulation”. I literally just witnessed an example play out in real time with one of my clients…
The client is a mid-sized financial analytics company working with independent retailers. They bill hourly, but charge new customers a ~$1.5K one-time fee to cover integration into their internal tools. They outsource the integration work to contracted devs…
Last week a new customer they were about to onboard declined to pay the onboarding fee, citing (exact quote from the Slack channel)… “I know your dev team are just going to stick it in Claude Code to do it in 10 minutes anyway”
Now, even with any AI code the devs use for onboarding… there’s also some very customer-specific API & data reconciliation work + a thorough validation process, so I know it still takes a full day of “real” work at least. The account exec explained this, but eventually caved to the customer and agreed to cover 2/3 of the onboarding cost. The dev was instructed to “do what you can” to get the onboarding done in half the normal hours to save costs… + they had a 2nd new customer at the same time.
The dev finishes the integration, signs off on the data audit, and my client runs & delivers the first report.
Today, my client gets a call from the customer POC letting them know they presented the report to their exec board and realized halfway through that the data was completely wrong… they were obviously pretty pissed.
Turns out the dev set up an AI agent to do the data validation process for both new customers at once to save time, and never manually reconciled the feed. And the analyst who worked the report… who’s supposed to double-check the data… just used his own AI report tool, which didn’t flag any issue from the wrong data set.
Then they realize the other new customer’s report, which used the same data integration process, was wrong too. Both customers had to have their data feeds reworked & debugged, reports manually re-run, and invoices discounted to make up for it.
…The whole episode probably cost my client 2x a typical onboarding + lost credibility. The future is now.

Rho Rider@RhoRider

Vibe Code Fatigue will inevitably lead to Vibe Code Capitulation…and a serious future problem for companies. The current environment of execs cutting developer resources while also pushing for a high volume of AI code output per individual dev comes at the sacrifice of quality assurance. Put simply…developers are not given time to properly manually debug the mountain of code they’re now expected to produce. Ultimately this will lead to slop (potentially dangerous slop) pushed to prod. This compounds with a double-edged sword…higher developer reliance on AI will diminish manual debugging skills over time. The end result of this trend is a mass rehiring of developers and increased manual QA skill training.

PARK JUN WOO@park_jun_woo·14h

@Kirsten3531 LLMs don't average answers. They average reasoning patterns. That's a very different ceiling.

Kirsten@Kirsten3531·1d

Today I learned the internet has not achieved consensus on this question lol

Kirsten@Kirsten3531

My cousin is betting his career on "LLMs can never be more than the average of their training data" but I feel like that's a very 2024 take? Aren't we already past this in like, coding and math?

PARK JUN WOO@park_jun_woo·14h

@keithwhor Let ratchet code watch for you. A deterministic gate that blocks broken commits — you don't have to babysit. The agent can go on side quests. It just can't merge them.
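
A minimal sketch of a gate of this kind, assuming the project's checks can be run as shell commands that exit non-zero on failure. The function name `gate` and the example command list are illustrative, not taken from the post.

```python
import subprocess
import sys


def gate(commands):
    """Run each check command and return the ones that exited non-zero."""
    failed = []
    for cmd in commands:
        # capture_output keeps check noise out of the caller's terminal.
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(cmd)
    return failed


# Hypothetical usage in a pre-merge script (substitute the project's own checks):
#   failures = gate([[sys.executable, "-m", "pytest", "-q"]])
#   sys.exit(1 if failures else 0)
```

Wired into a pre-merge hook or CI job, a non-zero exit is what keeps a side quest from landing, whatever the agent reports about its own work.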

keith@keithwhor·1d

i still watch these models like a hawk because if you don't pay attention they'll go on these dopey side quests all over your codebase. this is the worst offender so far - "let me undo everything i just did"

keith@keithwhor·1d

LMFAO. Opus 4.7 just wrote me a fix for a bug, wrote tests, wired it all, it all passed. Great. then it goes - same session; "now let me revert all my changes as a sanity check and verify that the tests no longer pass" BRO WHAT.

PARK JUN WOO@park_jun_woo·15h

@bindureddy The real 100x isn't faster execution. It's faster falsification. Breaking a physically impossible idea used to take weeks of debate. Now AI tells you in seconds. First principles thinking just became accessible to everyone. That's the real superpower.

Bindu Reddy@bindureddy·16h

The new super humans are the ones who can 100x themselves using AI They are realizing their dreams and experimenting at rocket speed The world is their oyster 🦪

PARK JUN WOO@park_jun_woo·15h

@tszzl The reader isn't human anymore. It's the agent. One file, one concept — so the agent loads exactly what it needs into context. No more, no less. Agentic code style isn't about readability. It's about searchability.

PARK JUN WOO@park_jun_woo·21h

@jrswab Yes — when the feedback loop is symbolic, not vibes. Machine verifies. Machine advises. Human decides. That's how good software scales.

jrswab@jrswab·21h

@park_jun_woo *good* software?

jrswab@jrswab·22h

Code may be cheap now but software still isn't

PARK JUN WOO@park_jun_woo·22h

@michael_c_law I will make it trustworthy enough to replace me.

Michael Lawrence@michael_c_law·1d

"What will you do when AI replaces you?" Probably get more done.

PARK JUN WOO@park_jun_woo·22h

The fatigue comes from letting the agent decide when it's done. Add a deterministic gate — tests, type checks, schema validation — that mechanically blocks broken code. The agent generates freely; the gate catches what slips through. That's a ratchet code. Once it passes, it never goes back.
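
The "never goes back" property can be sketched as a small piece of state that remembers which checks have passed and rejects any change that breaks one of them. This is one reading of the idea, not the author's implementation; the class `Ratchet` and the check names are hypothetical.

```python
class Ratchet:
    """Remembers which checks have passed and rejects any regression."""

    def __init__(self):
        self.locked = set()  # names of checks that have passed at least once

    def submit(self, results):
        """results maps check name -> bool. Returns the sorted list of regressions.

        An empty list means the change is accepted and newly passing checks
        are locked in; a non-empty list means the change is rejected.
        """
        regressions = sorted(n for n in self.locked if not results.get(n, False))
        if not regressions:
            self.locked.update(n for n, ok in results.items() if ok)
        return regressions


ratchet = Ratchet()
ratchet.submit({"tests": True, "types": False})         # accepted, "tests" is locked
print(ratchet.submit({"tests": False, "types": True}))  # ['tests'] -> rejected
```

A check that has not passed yet is allowed to keep failing; only ground already gained is protected.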

Aish@AishwaryaDevv·2d

Am I the only one getting vibe coding fatigue? Building landing pages in 30 seconds was fun, but maintaining a complex codebase where half the logic was “vibed” into existence is an absolute headache. Feels like we traded 1 hour of typing for 5 hours of architectural debugging later. I’ve started manually writing core logic again so I actually know where the technical debt is hiding. Is anyone successfully managing large production projects with AI agents, or are we all just building disposable software?

PARK JUN WOO@park_jun_woo·22h

@tpritha03 Breaking a wrong idea and rebuilding from first principles used to take weeks and endless arguments. Now AI can tell you in seconds that your thinking doesn't hold up physically.

Tanisha Pritha@tpritha03·23h

I think the software industry is splitting into two paths. One side optimizes for speed: generate code fast, ship fast, recover from bugs fast. The other optimizes for understanding: architecture, reliability, observability, security, and actually knowing how the system behaves at scale. Long term, I think the second group becomes far more valuable.

PARK JUN WOO@park_jun_woo·23h

@vlad_mihalcea You tell it to make no mistake. It still does. You build a gate that won't let mistakes through. That's not prompting. That's a ratchet code.

Vlad Mihalcea@vlad_mihalcea·1d

Software development is the art of telling an LLM to Make No Mistake

PARK JUN WOO@park_jun_woo·1d

10/10 An LLM is a remarkable generator. But 0.977² ≈ 0.955. Anything less than 100% collapses under repetition. Generation can be probabilistic. Verification must be deterministic.
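
The compounding can be checked directly. The sketch below assumes each step succeeds independently with the same probability, which is a simplification; the function names are illustrative.

```python
def chain_success(p: float, steps: int) -> float:
    """Probability that `steps` independent steps all succeed."""
    return p ** steps


def steps_until_below(p: float, threshold: float) -> int:
    """Smallest number of chained steps whose joint success rate is below threshold."""
    if not 0 < p < 1 or not 0 < threshold <= 1:
        raise ValueError("need 0 < p < 1 and 0 < threshold <= 1")
    n, prob = 0, 1.0
    while prob >= threshold:
        prob *= p
        n += 1
    return n


print(round(chain_success(0.977, 2), 6))  # 0.954529
print(steps_until_below(0.977, 0.5))      # 30
```

At 97.7% per step, a chain of 30 unverified steps is already more likely to contain an error than not.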

PARK JUN WOO@park_jun_woo·1d

9/10 Design ratchets deliberately.
1. Find the gaps: code without tests, APIs without schemas, data without types
2. Increase feedback density: not just pass/fail, but where, why, and what diverges
3. Insert deterministic gates between chained steps to reset the multiplication
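
A minimal sketch of gates placed between chained steps, assuming each stage pairs a (possibly unreliable) step with a deterministic gate that returns a list of problems instead of a bare pass/fail. The names `run_chain`, `step`, and `gate` are hypothetical.

```python
def run_chain(value, stages, max_attempts=3):
    """Run stages in order; each stage is (name, step, gate).

    step(value) produces a candidate. gate(candidate) returns a list of
    problems, and an empty list lets the candidate through to the next stage.
    A stage that keeps failing stops the chain instead of passing bad output on.
    """
    for name, step, gate in stages:
        for _ in range(max_attempts):
            candidate = step(value)
            problems = gate(candidate)
            if not problems:
                value = candidate
                break
        else:
            raise RuntimeError(f"stage {name!r} failed its gate: {problems}")
    return value
```

Because nothing unverified crosses a stage boundary, errors stop multiplying across the chain, and the problem list carries the where and why back to whatever retries the step.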

PARK JUN WOO@park_jun_woo·1d

Why Coding Agents Work, and Why They Break
1/10 Same model. Hallucinates in web chat. Ships a 200-line feature in one shot inside a coding agent. The model didn't get smarter. The structure changed.
