Lee Moore

139 posts

@leegmoore

AI Dev Guy, Principal Engineer, Enabler of Agentic SDLC nonsense

Joined September 2008
93 Following 63 Followers
Lee Moore
Lee Moore@leegmoore·
Where is the good GLM 5.2 inference these days? Fireworks? On the z.ai coding plan, 5.2 is getting dramatic, crashing out, and getting lost in neurotic thought loops. This wasn't happening a few days ago.
0
0
0
22
Lee Moore
Lee Moore@leegmoore·
@TechByTaraa if you only use one or the other, you are already hobbling yourself
0
0
1
1.3K
tara_
tara_@TechByTaraa·
I'm a Claude user. Give me one reason to switch to Codex
tara_ tweet media
309
13
731
206.8K
Lee Moore
Lee Moore@leegmoore·
Making great specs forces you to get clarity on what you want. Having great specs gives agents high-signal instructions for build and test. They also give the PR easy no-slop gates to pass before the human reviewer spends valuable attention on it.
dex@dexhorthy

If you feel like PR review / code review is the bottleneck in your system - figure out how to increase the odds that the code is 95% or 99% correct by the time it gets to review. You’re not spending enough time on writing a great spec, and that's why the implementation deviates. Incorrect decisions earlier cascade down. It all comes down to back pressure.

0
0
1
80
Lee Moore
Lee Moore@leegmoore·
@dexhorthy This is so god damn on point. Bro get out of my head
0
0
1
141
dex
dex@dexhorthy·
If you feel like PR review / code review is the bottleneck in your system - figure out how to increase the odds that the code is 95% or 99% correct by the time it gets to review. You’re not spending enough time on writing a great spec, and that's why the implementation deviates. Incorrect decisions earlier cascade down. It all comes down to back pressure.
8
7
90
7.9K
Lee Moore
Lee Moore@leegmoore·
Claude 5x or 20x Max + GPT Plus. Use Claude for front end. Or z.ai coding plan Pro + GPT Plus. This gets you a fair amount of GLM or Claude for front end and bread-and-butter work, and enough Codex for some back end work, pedantic review, and pedantic verification.
0
0
0
98
Rijn
Rijn@RijnHartman·
need help choosing the next ai coding plan. gpt-5.5 has felt dumber for me over the last 2 days and i’m hitting limits. claude code, cursor, or wait for 5.6? or something else? i mainly care about frontend quality (for what i'm working on rn) and not running out of usage
39
1
34
8.8K
Lee Moore
Lee Moore@leegmoore·
GLM 5.2 is an open weight model that crossed the Opus 4.5 inflection point. The excitement isn't hype. It's recognition of significance.
0
0
0
47
Lee Moore
Lee Moore@leegmoore·
@bentlegen It's not all Machiavellian machinations. GLM 5.2 has crossed the Opus 4.5 inflection point in coding. This is a big fucking deal for open weight.
0
0
0
248
Lee Moore
Lee Moore@leegmoore·
@mattpocockuk It’s a side effect of distilling a smaller model for coding activities and related long-horizon agentic work: it gets worse at general tasks (like working out how to teach solving a Rubik's Cube). This isn’t an Opus 4.6 replacement for all tasks, just for some big things like coding.
0
0
0
185
Matt Pocock
Matt Pocock@mattpocockuk·
This is a bad thing, btw - if a model takes 3 turns to exit the smart zone it's bad
9
1
226
27K
Matt Pocock
Matt Pocock@mattpocockuk·
GLM-5.2 is a monster thinker. Trying it with pi and my /teach skill, learning to solve the cube. Even on the lowest 'effort' (high) it spits out longer thinking traces than anything I've ever seen. 3 turns, 2-3 file reads, nearly 220K (!) of thinking traces.
91
47
1.8K
212K
Lee Moore
Lee Moore@leegmoore·
They also work well for language ports of modular software with lots of tests. Port the tests for a module, then port the module. Rinse and repeat for all modules. If 80% of the code is like that, you can knock all of it out with a decent control loop. This is why Jarred Sumner felt comfortable using Mythos and Dynamic Workflows (Claude Code control loop generator) to port (many say slop is a better verb than port) Bun to Rust.
0
0
0
883
Armin Ronacher ⇌
Armin Ronacher ⇌@mitsuhiko·
I decided to do some experiments with looping over the weekend. The only cases where they work so far for me are a) review b) research c) autoresearch. If someone uses them for actual implementation on a medium sized project, would love to have something to look at!
55
6
382
125.8K
Lee Moore
Lee Moore@leegmoore·
@github I've checked my account. I don't see an extra 200
3
0
4
2.2K
GitHub
GitHub@github·
Weekends are for building. Copilot Max users, check your account for an extra $200 in credits to power your next build in the GitHub Copilot app. Stand by for more offers for Pro and Pro+ users.
61
50
581
127.1K
Lee Moore
Lee Moore@leegmoore·
WTF X? Talk about fake trolling news. That shit is next level gaslighting for those of us still in the 27 stages of grief about Fable.
Lee Moore tweet media
1
0
1
99
Lee Moore
Lee Moore@leegmoore·
I've been messing with this off and on. Do you extend lint rules in language linters like eslint, or do you write a custom CLI wired into bun/pnpm scripts for different layers of deterministic code quality/architecture adherence linting? Or do you do it another way? I agree it's not sufficient, but it's another dev-time feedback mechanism to help keep agents on rails, and I don't see many folks sharing their specific techniques and tips on how they do it.
1
0
4
1.1K
dex
dex@dexhorthy·
You should have a linter. Hands down. You should have detailed rules, you should push determinism as far as it can go. Use AST analysis to tell your coding agents what needs to be fixed. You should absolutely do this. BUT if your anti-slop strategy is an LLM and a handful of linters, you’re gonna be disappointed.
29
17
406
32.6K
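A minimal sketch of the "use AST analysis to tell your coding agents what needs to be fixed" idea, written in Python with the standard-library `ast` module for brevity. The layer names (`domain`, `infra`), the `FORBIDDEN` rule table, and the `find_layer_violations` helper are all hypothetical; the same shape could be built as a custom eslint rule or a small CLI wired into bun/pnpm scripts.

```python
import ast

# Hypothetical layering rule: code in the "domain" layer must not import
# from the "infra" layer. Both layer names are assumptions for illustration.
FORBIDDEN = {"domain": {"infra"}}


def find_layer_violations(source: str, layer: str) -> list[str]:
    """Return one agent-readable message per forbidden import in `source`."""
    banned = FORBIDDEN.get(layer, set())
    messages = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            # Compare only the top-level package against the banned layers.
            if name.split(".")[0] in banned:
                messages.append(
                    f"line {node.lineno}: '{layer}' must not import '{name}'; "
                    f"depend on an interface in '{layer}' instead"
                )
    return messages


example = "import os\nfrom infra.db import Session\n"
violations = find_layer_violations(example, "domain")
for message in violations:
    print(message)
```

The point of the message format is that each violation names the line and the fix, so it can be fed straight back to an agent as deterministic feedback.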
Lee Moore
Lee Moore@leegmoore·
@ZackKorman As a Principal Engineer at one of these scrub companies I 100% agree. I wasn't disagreeing with the shitty part, just with what flavor of shitty it was.
0
0
1
32
Zack Korman
Zack Korman@ZackKorman·
@leegmoore Scrub f500 companies happen to have a huge amount of software and systems
1
0
1
269
Zack Korman
Zack Korman@ZackKorman·
The “delay Mythos so cybersecurity can prepare” crowd believes companies proactively invest in cybersecurity to defend against future threats. You sweet summer child.
61
98
985
39.5K
Lee Moore
Lee Moore@leegmoore·
Nothing. Though I'd say GLM-5.2 has Opus 4.6-like capabilities, and 4.7 and 4.8 are functionally regressions (despite the benchmarks). So current GLM 5.2 hit original 4.6 performance (4 months old), so Fable from China in 3-6 months? Probably pre-training it now. Unless they need an unnerfed Mythos/Fable to distill from, then it will be a bit longer.
0
0
3
344
Carlos E. Perez
Carlos E. Perez@IntuitMachine·
If GLM-5.2 has Opus 4.8-like capabilities, then what prevents Z.ai from creating a Mythos/Fable level AI? Is the cat already out of the bag wrt higher capable models?
23
3
45
6.8K
Lee Moore
Lee Moore@leegmoore·
Funny how it's hype (or fake news) when we disagree. I suppose it's easier to call it hype than to hold the tension of "A lot of smart people actively engaged with this stuff currently disagree with me". Smart people could be wrong and you could be right. Or vice versa. Time will tell. I personally think GLM 5.2 is Opus 4.5/4.6 level. And both 4.6 and 5.2 function better and more reliably than Opus 4.7 or 4.8. GLM 5.2 hitting Opus 4.5/4.6 level isn't that big a deal. It's still 4-5 months behind. So the excitement IMO is a function of 4.7 and 4.8 being a total dumbass in an unpredictable way 1 in every 5-8 turns, with lots of weird distillation and RL tics that GLM 5.2 doesn't have. So we've been living in Opus regressions for 2 versions and several months.
0
0
1
1K
Lee Moore
Lee Moore@leegmoore·
@ryanflorence You're almost there. Here's how you do it properly. /goal Do my job for me. Don't make mistakes.
0
0
1
113
Ryan Florence
Ryan Florence@ryanflorence·
"Do my job for me" Is this a loop?
9
1
44
6.6K
Lee Moore
Lee Moore@leegmoore·
@thdxr Sounds like the teaser tweet for the new 8 Day Agents product about to drop.
0
0
1
523
dax
dax@thdxr·
you're not working hard enough our team works 8 days a week
107
8
800
50.2K
Lee Moore
Lee Moore@leegmoore·
Feels like a primitive is being made the paradigm by a lot of folks. That's been my primary issue with proper Ralph loops/control loops. They are a great paradigm for very specific types of projects. But for broader software construction using effective functional and technical specs, they are 1 primitive among a number of orchestration primitives that are generally needed to make up a proper general agentic building process with AI. So I guess my issue isn't so much with the loops, but with the way they loop back around to be the silver bullet every 3-4 months.
0
0
1
54
dex
dex@dexhorthy·
@leegmoore state machine powers a control loop. you still have to design the CSW/DSW
1
0
1
324
dex
dex@dexhorthy·
stop writing loops, start writing *control loops*
- read current state
- read desired end state
- one incremental change
- repeat*
simplest example is a thermostat, but for your code base
there’s a reason why some of the best AI coders I know (doing Ralph-style work since Jan 2025) are all former Kubernetes people
* repeat on whatever interval you desire, we do a lot of these nightly, and small sprints ad hoc during dev
33
21
394
23.2K
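The thermostat version of the control loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Thermostat` class, the `reconcile_once` and `run_control_loop` helpers, and the 0.5-degree step are hypothetical names and values, not anything from the original post. For a code base, "current state" and "desired end state" would be read from the repo and the spec rather than from two floats.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    temperature: float  # current state
    target: float       # desired end state
    step: float = 0.5   # size of one incremental change (assumed value)


def reconcile_once(t: Thermostat) -> bool:
    """Make at most one incremental change; return True if already converged."""
    error = t.target - t.temperature  # read current vs. desired state
    if abs(error) < 1e-9:
        return True
    # Move toward the target by no more than one step.
    t.temperature += max(-t.step, min(t.step, error))
    return False


def run_control_loop(t: Thermostat, max_iterations: int = 100) -> int:
    """Repeat reconcile_once until converged; return the number of changes made."""
    for changes in range(max_iterations):
        if reconcile_once(t):
            return changes
    return max_iterations


room = Thermostat(temperature=18.0, target=20.0)
changes_made = run_control_loop(room)
print(changes_made, room.temperature)
```

Each pass re-reads the state instead of assuming the previous change worked, which is what separates a control loop from a plain loop over a fixed task list.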