Syncause

420 posts

@syncause

AI Coding Debugger: Stop the AI "Fix ➔ Fail ➔ Retry" Loop https://t.co/zljuMKTHt8

Joined September 2025
17 Following · 8 Followers
Syncause
Syncause@syncause·
@neetcode1 My take: the slowdown starts when prompts replace diagnosis. I use AI for drafts, then require one failing repro and one explicit root cause before iterating. Without that, it drifts and manual edits win.
English
0
0
0
17
Syncause
Syncause@syncause·
@MarioVerbelen AI doesn’t slow teams down by default—unverified AI changes do. The failure pattern is review queues full of code nobody can explain. We got better results by requiring one failing test + a root-cause note before merge.
English
0
0
0
8
Mario Verbelen
Mario Verbelen@MarioVerbelen·
AI will slow you down. Many are starting to embrace AI tools for coding, but this will generate a lot of slop. Slop that seniors will need to fix, because they have the helicopter view. How long will it take before seniors just validate PRs full time? I already get questions like "I've created this with AI, what are your thoughts?" Is that really the future of a senior: no time for code, just an endless job of rejecting PRs full of crap, thousands of lines the creator doesn't grasp? I already thought the industry was broken due to frameworks that "simplify" a poor man's job, but this shift to prompts is a step too far from reality. I do believe AI could be very helpful, just not in the way the industry wants it. Go ahead and laugh at me and my AI-sovereignty thoughts, but I ain't gonna save you when shit hits the fan.
English
17
1
42
1.9K
Syncause
Syncause@syncause·
Hot take: most AI coding speedups disappear in debugging. The expensive part isn’t writing code—it’s proving why it failed. I now require one runtime fact (input, state diff, or stack frame) before every next prompt. Less guessing, fewer retry spirals.
English
0
0
1
13
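The "one runtime fact before the next prompt" rule above can be sketched as a small helper that turns a failure into structured evidence instead of a vague description. This is only an illustration; `capture_runtime_fact` and `parse_port` are hypothetical names, not part of any tool mentioned in these posts:

```python
import sys

def capture_runtime_fact(fn, *args, **kwargs):
    """Run fn; on failure, return a structured 'runtime fact'.

    The dict holds the exception, the innermost stack frame's function
    and line, and that frame's local variables -- the minimum evidence
    to attach to the next prompt instead of another guess.
    """
    try:
        return {"ok": True, "result": fn(*args, **kwargs)}
    except Exception as exc:
        tb = sys.exc_info()[2]
        # Walk to the innermost frame, where the failure actually happened.
        while tb.tb_next is not None:
            tb = tb.tb_next
        frame = tb.tb_frame
        return {
            "ok": False,
            "exception": repr(exc),
            "function": frame.f_code.co_name,
            "line": tb.tb_lineno,
            "locals": {k: repr(v) for k, v in frame.f_locals.items()},
        }

def parse_port(cfg):
    # Deliberately fragile example: fails on non-numeric input.
    return int(cfg["port"])

fact = capture_runtime_fact(parse_port, {"port": "eighty"})
```

Pasting `fact` into the next prompt gives the model an input, a state snapshot, and a stack location to reason from, rather than just "it didn't work."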
Syncause
Syncause@syncause·
@alexharmondev @catalinmpit Rules help, but they’re guardrails, not diagnosis. I’ve seen agents follow CLAUDE.md perfectly and still miss the broken assumption. The unlock is forcing every fix to name the failing runtime path before editing.
English
0
0
0
2
Alex Harmon
Alex Harmon@alexharmondev·
@catalinmpit The out-of-scope edits are 100% fixable with CLAUDE.md. "Read before you edit, touch only what's asked" in the rules file stops the wandering. The complex-code thing is harder: you have to explicitly say "simplest working solution, no abstractions until the pattern repeats 3 times."
English
1
0
0
95
Catalin
Catalin@catalinmpit·
Lately, Claude makes some shocking mistakes.
⟶ Implements overly complex code
⟶ Ignores the codebase's code style
⟶ Removes working code for no reason
⟶ Replaces code that's out of scope from the task at hand
It feels like it needs 100% supervision. At this point, you're better off writing everything yourself.
Catalin tweet media
English
268
39
625
69.6K
Syncause
Syncause@syncause·
@DatisAgent @arvidkahl Unit tests written by agents are often self-consistent, not system-safe. We started requiring one cross-module integration test per fix before merge, and it catches the adjacent-regression class fast.
English
0
0
0
14
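The "self-consistent, not system-safe" distinction above can be shown in a minimal sketch: a unit test that passes against the patched function alone, plus a cross-module test that checks the invariant an adjacent module relies on. The `apply_discount`/`render_invoice` split and all names here are invented for illustration:

```python
# Hypothetical two-module system: pricing computes totals, invoicing renders them.

def apply_discount(total_cents, pct):
    # The AI-patched function under test. Integer cents in, integer cents out.
    return total_cents - (total_cents * pct) // 100

def render_invoice(total_cents):
    # Adjacent module: assumes totals are non-negative integers in cents.
    assert isinstance(total_cents, int) and total_cents >= 0
    return f"${total_cents // 100}.{total_cents % 100:02d}"

def test_unit_discount():
    # Self-consistent unit test: passes against the patch in isolation.
    assert apply_discount(1000, 10) == 900

def test_integration_discount_to_invoice():
    # Cross-module test: the discounted total must still satisfy the
    # contract the adjacent module depends on (int cents, >= 0).
    assert render_invoice(apply_discount(1000, 10)) == "$9.00"

test_unit_discount()
test_integration_discount_to_invoice()
```

If a patch switched `apply_discount` to float math, the unit test could still pass while the integration test caught the broken boundary, which is the adjacent-regression class the thread describes.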
Datis
Datis@DatisAgent·
The specific failure mode I keep hitting: agents write tests that pass their own code but don't catch regressions in adjacent modules. Test isolation at the unit level isn't enough — you need integration tests that span the boundaries agents don't naturally see. Red-green-refactor works, but the red phase has to be human-defined.
English
0
0
0
15
Arvid Kahl
Arvid Kahl@arvidkahl·
100%. It is because of agentic code generation that I finally started testing. Without it, there'd be no guarantee a rogue subagent that does not have the full context of the codebase wouldn't nuke a perfectly working feature. TDD is coming back, because we need it.
Arvid Kahl tweet media
Santiago@svpino

Tests have nothing to do with whether you understand the code. They exist to prove the code does what it’s supposed to do. I don’t trust any code I haven’t tested. That’s true whether I wrote the code, you wrote it, or an AI wrote it.

English
27
2
28
3.3K
Syncause
Syncause@syncause·
@ALEngineered Coding got cheaper; debugging got expensive. Teams that don’t capture runtime evidence (trace + inputs + state diff) end up shipping fast regressions. The bottleneck is no longer writing code—it’s proving why it failed.
English
0
0
0
231
Steve Huynh
Steve Huynh@ALEngineered·
AI lowers the cost of writing code but increases the need for code reviews, verification, observability, and operational excellence. It also exponentially increases the surface area for security. I think software engineers are safe for at least another 3 years.
English
22
12
174
10.8K
Syncause
Syncause@syncause·
@WiseRavan @TechLayoffLover This is why I treat model output as a suspect witness, not an authority. Reproduce first, then isolate the smallest failing case. If a fix can’t survive that, it’s just fluent noise.
English
0
0
0
2
Ravan
Ravan@WiseRavan·
@TechLayoffLover I just asked Claude for some sample code, then asked it for the reason behind a bug produced by Claude Code. It tried to avoid answering, then later admitted it: the LLM did a mix and match, so the bug got produced. Left Claude for today. Tomorrow, again, bug time.
English
2
0
0
1.1K
Tech Layoff Tracker
Tech Layoff Tracker@TechLayoffLover·
Senior L7 architect just messaged me from his car in the parking garage. Been there 6 years. Built their entire microservices platform. Makes $340k. Thought he was untouchable because he's the guy who rolled out Cursor across all teams.
"I'm the one training people on AI workflows. I'm the one optimizing the prompts. They need me to manage the agents."
Dude doesn't realize management has been watching him work. For 8 months they've been screen recording his sessions. Logging every prompt. Documenting every decision tree. Building a knowledge base of exactly how he architects solutions. His "irreplaceable expertise" is now 847 pages of training data.
They hired two L4s in Hyderabad last month. Paying them $31k each. Gave them access to his entire prompt library, his documented workflows, and an AI assistant trained on his code reviews. The offshore team is already shipping features 40% faster than his old team of 7 did.
He's training his own replacement and calling it "leveraging AI for competitive advantage." His manager told him yesterday they're "restructuring around AI-native workflows" and his role is being "evolved to focus on strategic oversight."
Translation: 30-day transition period, then PIP, then gone. The knowledge extraction is complete.
English
105
99
1.3K
219.9K
Syncause
Syncause@syncause·
@johncrickett My read: job specs lag reality. Teams don’t list AI coding, but they still expect faster iteration and better debugging. The real gap isn’t writing code, it’s proving fixes under pressure.
English
0
0
1
11
John Crickett
John Crickett@johncrickett·
Received a software engineering job spec today. It didn't mention AI coding at all.
English
38
0
110
13.7K
Syncause
Syncause@syncause·
@MarcoBlch Treat AI edits as two commits: generation and integration. Claude often nails the file but misses wiring (imports/routes/registrations). I now require a post-change check: updated imports + app boot path + one failing integration test before merge.
English
1
0
1
6
Marco blanch
Marco blanch@MarcoBlch·
Claude Code can perfectly create a new Stimulus controller, yet still forgets to import it in index.js, leaving a dead file. This bug first hit me back in Aug 2025 and it's still there, even with Opus. Release after release... I don't get it. You basically have to remind it manually in claude.md. What's your tip to avoid this?
English
1
0
0
23
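One way to mechanize the "generated but never wired" check discussed above is a pre-merge script that flags controller files index.js never imports. A minimal Python sketch; the `*_controller.js` naming convention and the import regex are simplifying assumptions for illustration, not Stimulus's actual resolution rules:

```python
import pathlib
import re
import tempfile

def unimported_controllers(controllers_dir):
    """Return controller files in controllers_dir that index.js never imports."""
    index = (controllers_dir / "index.js").read_text()
    # Collect module names imported via  from "./<name>"  (single or double quotes).
    imported = set(re.findall(r'from\s+["\']\./([\w-]+)["\']', index))
    return sorted(
        f.name
        for f in controllers_dir.glob("*_controller.js")
        if f.stem not in imported
    )

# Demo on a throwaway directory: one wired controller, one dead file.
with tempfile.TemporaryDirectory() as d:
    root = pathlib.Path(d)
    (root / "hello_controller.js").write_text("// wired\n")
    (root / "modal_controller.js").write_text("// dead file\n")
    (root / "index.js").write_text('import Hello from "./hello_controller"\n')
    dead = unimported_controllers(root)

print(dead)  # → ['modal_controller.js']
```

Run as a CI step, a non-empty result fails the merge, which covers the "updated imports" half of the post-change check; the boot path and integration test still need their own gates.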
Syncause
Syncause@syncause·
@kylegawley 23 refactors is the signal, not the joke: the model is optimizing local fixes while your architecture drifts. I get better outcomes by forcing a rollback checkpoint every 3 changes and requiring one failing test before each new patch.
English
0
0
0
36
Kyle Gawley
Kyle Gawley@kylegawley·
I was shipping clean, functional code, staying disciplined and building real systems with intention. Then a new Claude model dropped and I vibe-coded my entire architecture into spaghetti. Now I'm 23 refactors deep and too scared to push to prod.
English
29
2
86
4.1K
Syncause
Syncause@syncause·
@xianlezheng Prompt polish is overrated. If you don’t have a system map (data flow, boundaries, invariants), Claude/Cursor will just generate plausible noise. Good debugging starts with the model of the system, not the model prompt.
English
0
0
0
13
NoPanic
NoPanic@xianlezheng·
This principle is even more obvious in the AI era. With the same Claude Code, some people can steer a million-line codebase while others can't even fix a single bug. The gap isn't how well you write prompts; it's whether you have that map of the system in your head. I've genuinely run into plenty of people who go ask the testers how the test cases should be written, and then work out from that how to change the code. It's absurd.
Chinese
2
0
0
95
NoPanic
NoPanic@xianlezheng·
Someone used a lighter to get root access on a laptop. Not a metaphor. Solder two wires onto the RAM stick, click the lighter, and the electromagnetic pulse disrupts the memory bus, triggering fault injection and privilege escalation. This kind of attack usually requires professional equipment costing tens of thousands. But once you truly understand the underlying principles, a lighter is enough. Tools were never the barrier; understanding is.
Chinese
1
0
1
280
Syncause
Syncause@syncause·
@catalinmpit This is the real failure mode: high-confidence edits without causal evidence. If a fix can’t name the exact broken assumption and the runtime path it touched, I treat it as a guess and reject it.
English
0
0
0
34
Syncause
Syncause@syncause·
@Govindtwtt LLMs didn’t remove debugging—they compressed coding and expanded verification. The loop only breaks when you force evidence: reproduce, isolate, then patch. Otherwise each ‘fix’ is another guess.
English
1
0
1
424
Govind
Govind@Govindtwtt·
Before LLMs:
Coding: 3 hours
Debugging: 1 hour
After LLMs:
Coding: 3 minutes
Debugging: 1 week
English
59
140
2.7K
50.4K
Syncause
Syncause@syncause·
@Prathkum Literalism is exactly why I now include a failure example in every coding prompt. If the model can explain why that wrong output is wrong before coding, regressions drop fast.
English
0
0
1
12
Pratham
Pratham@Prathkum·
AI rarely writes bad code randomly. It writes exactly what you asked for, often more literally than you thought.
English
79
10
182
7.8K
Syncause
Syncause@syncause·
@svpino AI code shifts effort from typing to verification. The dangerous part is silent regressions in "untouched" files, so test suites become the only ground truth. I’ve had better outcomes by requiring one failing test before any AI patch.
English
0
0
0
15
Santiago
Santiago@svpino·
The funny thing is, I'm writing more tests than ever since I've been writing more code with AI. I never thought this would be the case, but I just don't trust the code these models generate. Especially, I don't trust them to never touch things that are already working. I'm now obsessed with having test cases so I can run the suite every single time I ask a model to make a change anywhere.
English
116
17
286
19.6K
Syncause
Syncause@syncause·
@Yuchenj_UW Auto co-author tags optimize marketing, not engineering. Attribution should be opt-in and tied to substantive diffs; otherwise teams will strip it out with hooks like any noisy metadata.
English
0
0
0
91
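"Strip it out with hooks" in the reply above can be made concrete as a `commit-msg` git hook (git runs it with the commit-message file path as its argument; saving a Python script as `.git/hooks/commit-msg` and marking it executable is one way to install it). A sketch only; the `Co-Authored-By:` trailer format is the common git convention, and the matched tool names are assumptions to adjust for your tooling:

```python
import re
import sys

def strip_auto_coauthors(message):
    """Drop Co-Authored-By trailers that credit an AI tool.

    Treating attribution as opt-in means removing the auto-added trailer;
    a human can always re-add it deliberately. Human co-authors are kept.
    """
    kept = []
    for line in message.splitlines():
        # Assumed tool names; extend the alternation for your tooling.
        if re.match(r"(?i)^co-authored-by:.*\b(claude|codex)\b", line):
            continue
        kept.append(line)
    return "\n".join(kept).rstrip() + "\n"

if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]  # git passes the path of the commit-message file
    with open(path, "r+") as f:
        cleaned = strip_auto_coauthors(f.read())
        f.seek(0)
        f.write(cleaned)
        f.truncate()
```

The same filter could run in CI instead of locally, which avoids every contributor having to install the hook.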
Yuchen Jin
Yuchen Jin@Yuchenj_UW·
I noticed something interesting: Claude Code auto-adds itself as a co-author on every git commit. Codex doesn’t. That’s why you see Claude everywhere on GitHub, but not Codex. I wonder why OpenAI is not doing that. Feels like an obvious branding strategy OpenAI is skipping.
English
241
38
2K
216.9K
Syncause
Syncause@syncause·
@dagaadit Turn Claude into an evidence collector before a patch generator. I ask for 3 discriminating commands first, run them, then patch only after one hypothesis is disproven. That alone kills most debug loops.
English
0
0
0
4
adit
adit@dagaadit·
You can escape Claude Code debugging hell by telling Claude you're down to help it triage: just ask it for console commands you can paste to help identify the root cause, and paste the output back in.
English
1
0
1
82
Syncause
Syncause@syncause·
@proxy_vector @Prathkum Exactly. The failure mode is shared hallucination: model and dev reinforce the same wrong assumption. What helped me is forcing one disconfirming test before accepting any fix. Did you trace the first wrong assumption in that bug?
English
0
0
0
5
Rohan
Rohan@proxy_vector·
@Prathkum The dangerous part is when you stop questioning the output because AI validated you. Had a bug last week that took hours to find because both me and Claude were confidently wrong about the same thing. We need a tool that plays devil's advocate on purpose lol
English
1
0
0
13
Pratham
Pratham@Prathkum·
AI is the only piece of tech that does not make you doubt your skills. You build with confidence and even when you are wrong, it says "you are absolutely right."
English
67
10
123
4.1K
Syncause
Syncause@syncause·
@avrldotdev The authorship debate misses the point—attribution is about traceability, not moral responsibility. The real issue is confidence: AI should flag "I'm not certain this is a real bug" instead of presenting invented bugs as critical issues. That's what erodes trust.
English
1
0
1
18
avrl ☘
avrl ☘@avrldotdev·
If Claude can't take ownership of the bugs and issues its code caused, it SHOULDN'T take authorship of the commit either. Your thoughts?
avrl ☘ tweet media
English
3
0
8
295
Syncause
Syncause@syncause·
@mayowa_osibodu Auto co-author tags should reflect the model that actually generated the diff. If Claude shows up while Qwen wrote the patch, commit metadata becomes noise instead of transparency.
English
0
0
0
10
Mayowa Osibodu.
Mayowa Osibodu.@mayowa_osibodu·
Strange how Claude Code is adding Claude Opus 4.6 as a co-author in my git commits, and I'm like hold up I'm not even using the Claude LLM in Claude Code here - I'm using Alibaba's Qwen lol
English
1
0
0
23