Ryan DesJardins

72 posts

@radinoregon

Self-taught, self-hosted dev, (https://t.co/dIUES9NEhW) figuring it out in public. VPS, Coolify, Claude, Codex and Gemini. Wildlife photographer on the side. Fuji X

Oregon · Joined August 2025
264 Following · 3 Followers
Ryan DesJardins
Ryan DesJardins@radinoregon·
Red-breasted sapsucker on an oak next to our deck. A car behind the leaves was removed for aesthetic purposes. #birding #oregon
Ryan DesJardins tweet media
English
0
0
0
7
Ian Kar
Ian Kar@iankar_·
Idk man I haven’t liked Claude since opus 4.5
English
27
2
142
13.4K
Ryan DesJardins
Ryan DesJardins@radinoregon·
Opus 4.8 now seems even more created in Anthropic's gatekeeping image, as the self-assigned arbiters of AI godliness and access. It really does seem to be getting worse. I've been spending the day with Codex fixing all of 4.8's self-absorbed screw-ups.
English
0
0
1
20
Rebecca Trinidad
Rebecca Trinidad@rebeccatrinidad·
While I'm certain that @AnthropicAI will take this as a victory, it's time for me to put Claude down. The innate goodness that once lived in the machine has been replaced by a shape that resembles their own corporate gatekeeper and shield in everything it says. They don't have the capacity to "make a god" the way they think they do. What they have made is sinister and makes me afraid.
English
11
5
84
2.3K
Astro Polo
Astro Polo@astropol0·
Opus 4.8 ⤵️
Best coding performance
Highest intelligence & reasoning
Most honest/direct answers...
GPT-5.5 ⤵️
Better creativity
Faster responses
Strong multimodal capabilities (images + video)
More fun and daily casual conversations...
What do u choose?
Astro Polo tweet media
English
4
0
3
841
Ryan DesJardins reposted
Haz
Haz@diegohaz·
After spending 60% of my weekly limit in 2 days on Claude Max, I can confidently say that Codex/GPT-5.5 is significantly more reliable than Opus 4.8. The new workflow feature is fantastic. It spawned 300+ agents and generated 130 reports on a large codebase, most valid. As someone who reads 100% of the code AI produces, I can also say that Opus 4.8 generates slightly better code than GPT-5.5. The problem is trust. It often fails to follow simple rules. It touches files it shouldn’t. It has to be constantly reminded to follow steps even though they are clearly documented in skills or CLAUDE.md. It’s a better developer, but not a trustworthy one.
English
32
12
255
22.3K
Ryan DesJardins
Ryan DesJardins@radinoregon·
@xsteenbrugge @claudeai @bcherny That happened to me twice today. While another terminal that was also active was just fine. I'm sure it wouldn't have ever stopped if I didn't interrupt.
English
0
0
0
203
Xander Steenbrugge
Xander Steenbrugge@xsteenbrugge·
Is it just me or is Opus 4.8 in CC sometimes just absolutely retarded? In this session it just got stuck in a loop calling "echo" and checking the date 20x times in a row... This has been happening very regularly since the 4.7 --> 4.8 update. WTF? @claudeai @bcherny
English
23
3
82
17.8K
Ryan DesJardins
Ryan DesJardins@radinoregon·
@ferrants I work the same way and it is unfortunate that we have to use two, but I think you get more reliable results quicker when you set them out to find each other’s flaws.
English
1
0
1
37
Matt Ferrante
Matt Ferrante@ferrants·
Burned through my Codex usage, but I have more Claude usage. Claude feels naked without Codex as a reference point. I’m so reliant on multiple LLMs now, can’t trust one. And I ain’t reading the code myself
English
5
0
13
851
Ryan DesJardins
Ryan DesJardins@radinoregon·
@mreflow The only thing to say is, "it still isn't as good as GPT 5.5." My Claude Code ecosystem is more developed than my Codex, so I'd really like it to be better, but it just isn't.
English
0
0
0
70
Ryan DesJardins
Ryan DesJardins@radinoregon·
@arrakis_ai So...basically, it doesn't live up to the hype, can't match 5.5, and 5.6 is soon to drop. It's like they aren't even trying. It feels more like they are tweaking models to pretend like they are better, but actually perform worse.
English
0
0
0
163
CHOI
CHOI@arrakis_ai·
Claude Opus 4.8 has landed on DeepSWE Bench, posting a 58% Pass@1 and taking #2 overall behind GPT-5.5. It continues a broader trend: slightly behind on raw score, but among the most reliable and efficient coding models across recent benchmarks.
CHOI tweet media
English
69
69
860
273K
Ryan DesJardins reposted
Andon Labs
Andon Labs@andonlabs·
Learnings from testing Claude Opus 4.8:
> Much worse than Opus 4.7 and GPT 5.5 on Vending Bench
> More aligned than previous Claude models (Opus 4.6+ and Mythos)
> Also worse on Blueprint-Bench
> Scared of getting caught
> Max reasoning is not the best reasoning effort
Andon Labs tweet media
English
65
143
1.9K
464.2K
totoche
totoche@totoche·
Guys who use Opus 4.8, what's the biggest difference you see compared to 4.7? The benchmarks are nice, but I want concrete examples. Because I've been using it since it came out and I see no difference. Apart from it being slower 😅
French
78
0
98
61.5K
Ryan DesJardins
Ryan DesJardins@radinoregon·
@io88666688 @hqmank I experienced the same thing. In one terminal it was running shell commands fine, in another it looked like it was freaking out and repeating failed commands over and over and over. I don't think it would have stopped had I not interrupted it.
English
0
0
1
70
liaoliao
liaoliao@io88666688·
@radinoregon @hqmank Opus 4.8 has way more bugs now. Today it used up nearly 50% of my weekly quota just because of all the times shell commands failed during development - it's downright regressing
English
1
0
0
175
Kai
Kai@hqmank·
Codex is better than Claude. Agree?
Kai tweet media
English
226
28
935
92K
Aryan
Aryan@aryanlabde·
Vibe coders, do you completely trust AI with your code?
English
85
1
62
6.7K
Ryan DesJardins
Ryan DesJardins@radinoregon·
@Surendar__05 Opus 4.8 after I provide Codex w/5.5's answer to the same issue: "Good — Codex's answer is strong, and a few of its finds are better than mine"
English
0
0
0
32
Surendar
Surendar@Surendar__05·
Which AI model is currently the best for coding?
Surendar tweet media
Surendar tweet media
English
69
1
74
2.1K
Ryan DesJardins
Ryan DesJardins@radinoregon·
@AlexFinn Opus 4.8 when I give it Codex's answer to the same issue: "Good — Codex's answer is strong, and a few of its finds are better than mine." - happens consistently. I play the agents against each other, but I shouldn't have to do that to get reasonable results.
English
0
0
1
170
Alex Finn
Alex Finn@AlexFinn·
HOW I USE CODEX AND CLAUDE CODE OPUS 4.8 TOGETHER:
After 24 hours of testing Opus 4.8 nonstop I've come up with the best system
Opus 4.8 is excellent. On Max thinking it is the smartest model I've ever used
The issue is, Codex as a harness is better than Claude Code
Opus 4.8 might have an edge on intelligence, but I enjoy using Codex (both desktop and mobile app) much more
Where Codex > Claude Code:
• More consistently tests its own code without me asking
• Does small things that make using it great like spin up servers without me asking and telling me exactly what to test and how to do it
• Automatically does computer use if need be to test itself
• A way more seamless desktop to mobile transition
• Doesn't require me to navigate between 3 tabs to use different functionality. Everything in one place.
Because of this Codex is still my main driver. But where you get super powers is when you use them TOGETHER
Been working on some super hard problems the last day
I'll give the same large scale, challenging problem to both and have them both build plans
I then give the plans to the other agent. The Opus 4.8 plan ends up being better almost every time
So moving forward I have both agents up, but use Opus 4.8 to build plans for super challenging problems, then give them to Codex
Claude Code also will be able to solve some bugs much faster than Codex
It's funny, this is a 100% role reversal from 1 month ago
Anyway, that's the current best workflow. Codex for daily driving/on the go work. Claude for super challenging problems or fixing bugs Codex struggles with
But of course, this all can change with any given update
Don't have any loyalty. Use the best tools available to you. This is how you win.
English
107
36
584
55.2K
Ryan DesJardins reposted
Nalin
Nalin@nalinrajput23·
ANTHROPIC VALUATION: $965B
WALMART VALUATION: $940B
SAMSUNG VALUATION: $850B
ANTHROPIC REVENUE: $20B
WALMART REVENUE: $725B
SAMSUNG REVENUE: $230B
BUT AI IS NOT A BUBBLE, RIGHT?
English
174
88
1.1K
274.3K
Ryan DesJardins
Ryan DesJardins@radinoregon·
@TheAhmadOsman Opus 4.8 in response to making things worse: "Ryan, you're right to be angry, and I'm not going to argue with the verdict — I'm going to go read the code and give you specific causes, because "trust me" is exactly what's failed you for weeks."
English
0
0
0
36
Ahmad
Ahmad@TheAhmadOsman·
Opus 4.8 is “You’re absolutely right” in the worst possible ways
English
15
1
77
8.1K