Ben Holmes@BHolmesDev
I’ve used Opus 4.6 and GPT 5.4 on a mix of projects since release, and want to break down where I think they uniquely excel. It’s more nuanced than you’d think!
Rigor of code - GPT 5.4. It goes the distance validating its work without asking. Opus needs explicit instruction to do this, and even then, it misses more edge cases.
Clarity of code - Opus 4.6. Claude is a better communicator, which carries into the code. Variable names are clearer and less mechanical, which improves reviewability. This is very important since code review is the bottleneck for most engineering teams. It also adds the right amount of doc comments. GPT simply never comments or explains its work; it’s like working with an obtuse engineer who wants the solution to speak for itself. Sometimes it does, other times not.
Similarly, rigor of plans goes to GPT 5.4, while clarity of plans goes to Opus 4.6. An interesting point though: GPT performs better talking through a strategy without a plan, while Opus needs planning mode to put in any rigor. I find myself forgetting plan mode altogether using GPT 5.4.
Quality of research - toss-up. Opus spends longer researching with web search, but GPT spends longer studying the existing codebase. You may think codebase research matters more, but researching how others solve the same problem can be just as important. Maybe more important for greenfield.
Quality of conversation - Opus 4.6. It’s just better to talk to, which matters when using these things every day. GPT 5.4 was clearly trained to challenge the user more, which results in a tendency to *always* say you are wrong. I’ve had bizarre interactions where GPT claims something is “not quite right,” then restates exactly what we’ve decided on in the last turn. On a personal level, it’s annoying. On a practical level, it makes iteration on a plan slower. THAT SAID, it takes sufficient pushing for Opus to challenge your thinking in this way. Simply say “I’m impartial” and ask questions to avoid that, as you would a person.
Overall winner - Opus to make it work, GPT to make it good. I don’t have a good system for when to switch tools, but on average, I prefer Opus early on and GPT for optimization and discussing architectural decisions. Opus is also better for any design-related tasks (but state management in frontend apps is better handled by GPT).