fullofcaffeine @FullOfCaffeine
4K posts · Planet Earth · Joined December 2008
5.3K Following · 733 Followers

fullofcaffeine @FullOfCaffeine·
@burkov @thsottiaux Yes! It requires quite a bit of steering to produce the output that Opus often gets right the first time. That means Opus is often much better for documentation.
BURKOV @burkov·
@thsottiaux It's not good at explaining a problem or a solution in simple English.
Tibo @thsottiaux·
Hello builders. What are we getting wrong with Codex, what can we improve?
fullofcaffeine @FullOfCaffeine·
GPT 5.4 is not good at writing user-friendly docs. To be more specific: GPT usually tends to be way too terse and technical, and uses terms that might not be familiar to the end user, e.g. in a public README. Steering often fixes that, though. Opus 4.5 is much better at capturing my intentions given the context; its prose is often much better by default. And of course, its design skills. Opus is still better! And finally, and foremost, the current Codex outage, happening just now: github.com/openai/codex/i….
fullofcaffeine @FullOfCaffeine·
@RCallsign @sudoingX Well, I can only hope it becomes cost- and time-effective! I use and like the closed-weight frontier models, but do you really want to live in a world where you depend solely on them? That's pretty depressing and dangerous.
Sudo su @sudoingX·
hey if you have a 3060, or any GPU with 8GB or more sitting in a drawer right now, that thing can run 9 billion parameters of intelligence autonomously. and you don't know it yet.

2 hours ago i posted that 9B hit a ceiling. 2,699 lines across 11 files. blank screen. said the limit for autonomous multi-file coding on 9 billion parameters is real.

then i audited every file. found 11 bugs. exact file, exact line, exact fix. duplicate variable declarations killing the script loader. a canvas reference never connected to the DOM. enemies with no movement logic. particle systems called on the class instead of the instance.

fed that list as a single prompt to the same Qwen 3.5 9B on the same RTX 3060 through Hermes Agent. it fixed all 11. surgically. patch-level edits across 4 files. no rewrites. no hallucinated changes. game boots. enemies spawn, move, collide. background renders. particles fire.

and here's what nobody is talking about. this is a 9 billion parameter model running a full agentic framework. Hermes Agent with 31 tools. file operations, terminal, browser, code execution. not a single tool call failed. the agent chain never broke. most people think you need 70B+ for reliable tool use. this is 9B on 12 gigs doing it clean.

the model didn't fail. my prompting strategy did. the ceiling is not the parameter count. the ceiling is how you prompt it.

this is not done. bullets don't fire yet. boss fights need wiring. but the screen that was black 2 hours ago now has a full game rendering in real time. iterating right now. anyone with a GPU from the last 5 years should be paying attention to what is happening right now.
Sudo su @sudoingX

9B on a 3060. 2,699 lines. 11 files. blank screen. Qwen 3.5 9B Q4 running through Hermes Agent wrote the full Octopus Invaders project autonomously. config, audio, particles, background, enemies, player, ui, game loop, README. structured the directory, separated concerns, documented everything. then it self-diagnosed 10 bugs and patched them across files. fixed variable references, missing classes, broken directory paths. even created a reusable Hermes skill for future game builds unprompted.

the code reads like a senior dev wrote it. clean architecture, proper separation, professional naming. but CONFIG.canvas is null on line 1 of initGame(). the game crashes before a single frame renders.

9B understands structure. it can architect, scaffold, and debug individual files. what it can't do is hold 10 files in context and wire them together correctly. duplicate Bullet classes across two files with incompatible interfaces. static method calls on instance-based classes. enemies that spawn but never move because there's no y += speed.

35B on a 3090 built 3,483 lines in one pass and it ran. 9B built 2,699 lines across multiple iterations and the screen is black. the ceiling for autonomous multi-file coding on 9B parameters is real.

still iterating. trying a single-file version of the same prompt next to isolate what 9B can actually close on.
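For readers unfamiliar with the bug categories the audit describes, here is a minimal sketch of two of them: a static call on an instance-based class, and an enemy that spawns but never moves. The `ParticleSystem` and `Enemy` classes are illustrative stand-ins, not taken from the actual Octopus Invaders code.

```typescript
// Hypothetical particle system: state lives on the instance,
// so its methods must be called on an instance, not the class.
class ParticleSystem {
  particles: number[] = [];
  emit(count: number): void {
    for (let i = 0; i < count; i++) this.particles.push(i);
  }
}

// The buggy version of Enemy had no movement logic at all:
// update() existed but never touched y, so enemies sat still.
class Enemy {
  y = 0;
  speed = 2;
  update(): void {
    this.y += this.speed; // the missing `y += speed` the audit found
  }
}

// Buggy pattern: `ParticleSystem.emit(10)` fails, emit isn't static.
// Fixed pattern: construct an instance and call the method on it.
const fx = new ParticleSystem();
fx.emit(10);

const enemy = new Enemy();
for (let frame = 0; frame < 3; frame++) enemy.update();

console.log(fx.particles.length, enemy.y); // 10 6
```

Both failures are syntactically plausible code, which is why they survive generation and only surface at runtime as a black screen.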

fullofcaffeine @FullOfCaffeine·
@sudoingX I have a spare Quadro RTX 5000 with 16GB. Wondering if it'd at least get close to that? Great content, btw!
Tyler @rezoundous·
Dear Codex, please get better at UI so I can unsubscribe from Claude and Gemini.
fullofcaffeine @FullOfCaffeine·
You can do that, and it might work reasonably well. The difference is that the agent needs to do less, and the oracle implements it deterministically. The oracle also implements and abstracts the browser communication/orchestration side (if you're using the browser), which would be a nightmare for the agent to do every time. If you're using the API only, you might be able to implement a similar system via instructions in AGENTS.md, or a skill, or just prompt it every time.
Brendan Smith, Ph.D. @mrbsmith58·
@FullOfCaffeine @VictorTaelin Will check it out! Quick question: how is Oracle different from having Codex zip relevant files and writing a markdown defining the files and the issue at hand?
Taelin @VictorTaelin·
Quick 2am success story: asked GPT-5.4 to simplify Bend2's elaborator; 4h later, no real improvements. Asked it to write a big prompt asking for help, passed it to 5.4 *Pro*, pasted the response back into Codex, which landed a massive simplification. Seems like the Pro version enlightened it. Perhaps a nice feature to have natively in Codex would be to just pause what it is doing and invoke the Pro version for a plan. This was my first time using Pro and it was definitely worth it.
fullofcaffeine @FullOfCaffeine·
@VictorTaelin You can use something like github.com/steipete/oracle to automate that. The browser integration is flaky, but I've made it work better locally, with some additional fixes. It also supports Pro via the API, which is, of course, reliable.
fullofcaffeine @FullOfCaffeine·
@VictorTaelin Pro is a beast. I love it. This workflow you described is great; it's almost a "secret sauce" kind of thing. Too bad it's very slow, but it's worth it most of the time.
fullofcaffeine @FullOfCaffeine·
Codex requests are extremely slow at the moment. In fact, it seems to be hanging at this very moment. GPT 5.4 high/xhigh. Tried to interrupt one of them to get out of the deadlock, asked Codex to continue, and got this error message: `■ stream disconnected before completion: An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID d424b136-c4ef-42ae-bf12-7bb0c0df7028 in your message.` Is it due to capacity/API issues? cc @thsottiaux
fullofcaffeine @FullOfCaffeine·
@georgepickett xhigh can be counterproductive sometimes. Wish Codex could auto-adjust between high/xhigh as needed! Haven't tried fast yet, though.
George Pickett @georgepickett·
btw - this is not a diss at Codex, more of a brag. I have 5.4 on xhigh and on fast mode. I know the implications of that. If I didn't get extreme value out of running agents like this, I would not be using it 10 hours/day.
George Pickett @georgepickett·
Burned 31% of a ChatGPT Pro sub in one 5hr session. Not looking forward to April 2, when the 2x rate limits go away.
fullofcaffeine @FullOfCaffeine·
@BLCNYY It's not always the right reasoning level; there are drawbacks depending on the case, not to mention it's slower. `High` is a sweet spot.
BLCNYY @BLCNYY·
Am I the only one who always uses the Extra High reasoning effort in Codex, regardless of how hard the task is? 🤔
fullofcaffeine @FullOfCaffeine·
@therealnoozo Bashing how? I've been seeing a lot of praise in my timeline instead. Love the BEAM, btw!
{:ok, %Pedro{}} @therealnoozo·
So many people bashing the BEAM lately. I couldn't care less. All I know is that I moved to Elixir years ago and no other language/ecosystem has made me this happy. I don't need a million libraries. My stack is app and PostgreSQL. And our apps are neither small nor simple by any count either. We have AI, video processing, transcriptions, and many other things in the same monolith. No regrets.
Numman Ali @nummanali·
@Draja441 It stops going in circles, over-engineering, and second-guessing itself.
Numman Ali @nummanali·
The rumours are true. After always being on XHigh in Codex, I can say with confidence that GPT 5.4 is better with High.
fullofcaffeine @FullOfCaffeine·
Agreed! Are you using xhigh all the time for all kinds of projects, though? I've found high to still be very good, but it's hard to benchmark that quantitatively. The main open loop I have now is deciding the optimal thinking level for a given project/task. Heck, if I had unlimited credits, fine, I wouldn't even mind the slowness, but xhigh uses a lot of tokens ;)
Ryan Carson @ryancarson·
$200/mo ChatGPT Pro + gpt-5.4 xhigh + Codex Mac App is the most asymmetric upside I've ever seen in a tool chain. A ridiculous cheat code for founders.
Tomas Cupr @tomcupr·
Opus 4.6 felt different for coding. Things just suddenly worked. GPT-5.4 xhigh in Codex feels like another leap. It's like Opus 4.6, but it goes deeper and considers much broader implications of its work and the whole development setup. Crazy. Amazing job @thsottiaux.
fullofcaffeine @FullOfCaffeine·
@SeloSlav Looks beautiful! Did you draw the assets, or were they AI-assisted/generated?
Martin Erlić @SeloSlav·
10 months ago I did something slightly insane. I started building an entire 2D top-down multiplayer game engine… inside React. 3500 files later, hundreds of hooks, and somehow the game loop still runs shockingly well thanks to AI-assisted optimization. But the architecture is now totally out of control. Time to extract the engine from React before this thing collapses under its own hooks.