Christian Waldow
@cwaldow
380 posts
Contrarian Investor, Data Scientist
Deutschland · Joined January 2011
447 Following · 83 Followers
Christian Waldow@cwaldow·
@nummanali I use 400 million tokens with Minimax daily vs 200 million with GPT on Plus weekly…
Numman Ali@nummanali·
@cwaldow Really? You find the codex plan is not sufficient?
Numman Ali@nummanali·
GPT 5.5 is beyond my expectations: 9h 00m 49s of coherent work on an ML library.

Every night this week I have let it build with constraints through:
- AGENTS.md
- CONTINUITY.md
- MEMORY.md
- PLAN.md
- .agents/skills

Through the combination of these, it has been steering its own work, deciding what to pick up from a plan, how to approach it as a tranche, and dynamically updating its inbuilt task list. This morning I woke up to complete MLX support in TypeScript for Flux2, Z Turbo, and Qwen image generation. Mind-blowing.

This model is the biggest release we have had since The Great Awakening in December 2025. It has changed the way I approach a problem and how I construct the prompting guidelines and the infra around a project. It comes down to some simple principles that enable long-running, coherent agents:
- clear instructions on outcomes, but freedom on solutions
- a dynamic memory workspace and task management
- a verifiable repository with all gates in place
- the ability to create skills and self-improve on them through use
- powerful sub-agent access to validate hypotheses in the absence of humans

And one last key thing here is the Codex CLI, or the app if you prefer. The magic they have done with their compaction strategies is phenomenal. It is remarkable how the agent is able to stay coherent after multiple, if not hundreds, of compactions, especially when given the additional space to jot down its findings and updated approach.

My recommendation is to ask Codex to optimise your repo for agentic engineering and apply its principles so that it is able to work for a long period of time. Hopefully, when I have more time, I will write a longer article on this, but I am very happy to answer any questions anyone might have and provide guidance to the best of my ability.
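The workspace-file setup described above can be sketched as a small scaffolding script. The filenames (AGENTS.md, CONTINUITY.md, MEMORY.md, PLAN.md, .agents/skills) come from the tweet; the stub contents and the `scaffold` helper are illustrative assumptions, not the author's actual files:

```python
from pathlib import Path

# Filenames are taken from the tweet above; the stub text is illustrative only.
SCAFFOLD = {
    "AGENTS.md": "# Agent instructions\n\nOutcomes to hit and gates to pass; solutions are the agent's choice.\n",
    "CONTINUITY.md": "# Continuity notes\n\nWhat the last session finished and where to resume.\n",
    "MEMORY.md": "# Working memory\n\nFindings and the updated approach, written down to survive compaction.\n",
    "PLAN.md": "# Plan\n\n- [ ] next tranche of work\n",
}

def scaffold(repo: Path) -> list[str]:
    """Create the agent workspace files (plus .agents/skills/) if missing.

    Returns the names of the files that were newly created, sorted.
    """
    (repo / ".agents" / "skills").mkdir(parents=True, exist_ok=True)
    created = []
    for name, stub in SCAFFOLD.items():
        path = repo / name
        if not path.exists():
            path.write_text(stub)
            created.append(name)
    return sorted(created)

if __name__ == "__main__":
    print(scaffold(Path(".")))
```

Running it twice is a no-op the second time, which matters when an agent re-enters a repo mid-task.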
[image]
Christian Waldow@cwaldow·
@RnaudBertrand I do the same every workday with Minimax, and it’s included in the subscription. Of course you need a stronger thinking model as architect and reviewer.
Arnaud Bertrand@RnaudBertrand·
Hard to calculate exactly without an input/output split, but I did the math. For 831,962,136 tokens, Anthropic Opus 4.7 would cost:
- 100% input (floor, unrealistic): 831.96M × $5/M = $4,159.81
- 90/10 (typical for coding agents like opencode, where most tokens are codebase context re-fed each turn): $3,743.83 + $2,079.91 = $5,823.74
- 80/20 (more conservative): $3,327.85 + $4,159.81 = $7,487.66
- 50/50 (worst plausible case): $2,079.90 + $10,399.53 = $12,479.43

So that $10.57 DeepSeek bill would probably become roughly $5,000–8,000 on Claude Opus 4.7. In other words, DeepSeek is 500–700× cheaper, for similar-ish capabilities.

Now you start to understand why Anthropic is worried...
Khalid Warsame@KhalidWarsa

Is DeepSeek V4 Pro cheap? I consumed 831,962,136 tokens in under 2 days and paid $10 for it.
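The arithmetic in the thread above can be checked with a short script. The $5/M input and $25/M output rates are inferred from the tweet's own figures, not stated in it, so treat them as assumptions:

```python
# Per-million-token rates, inferred from the tweet's arithmetic (assumption).
PRICE_IN = 5.00    # $/M input tokens
PRICE_OUT = 25.00  # $/M output tokens
TOTAL_TOKENS = 831_962_136  # token count from the quoted tweet

def blended_cost(input_share: float) -> float:
    """Total cost in dollars for a given input/output token split."""
    tokens_in = TOTAL_TOKENS * input_share
    tokens_out = TOTAL_TOKENS * (1 - input_share)
    return (tokens_in * PRICE_IN + tokens_out * PRICE_OUT) / 1_000_000

if __name__ == "__main__":
    for share in (1.0, 0.9, 0.8, 0.5):
        print(f"{share:.0%} input: ${blended_cost(share):,.2f}")
```

The tweet's line items round each component before summing, so totals can differ from this script by a cent.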

MaDdWiz@MonetaryMonsta·
@AdvicebyAimar Try mine. Mempalace is just archiving chats. That’s going to balloon over time with 24/7 agent workflows. My NovaSpine has hybrid recall with lossless, queryable, intelligent compression that gets better over time. Making improvements daily. More commits tomorrow github.com/maddwiz/NovaSp…
[image]
Aimar Haddadi@AdvicebyAimar·
i can spot a grifter from miles away. so i dug into the code to figure out if this is legit or not. guess i was right.

ben is a crypto founder who runs some weird bitcoin lending platform. i was pretty sure he knows absolutely nothing about ai and memory, so i tracked down the repo myself since i was curious. his website says he likes to build ai-powered products and train local ai models? sure man, 80% of your github repos are bitcoin-related stuff. only one ai-related project came up, which you forked in 2024.

mempalace has 10k github stars, more than 1k forks, but only.. 7 commits? apparently the best memory layer to date? no git author history, no account connected to whoever wrote the code of this codebase. it doesn't add up.. the account that pushed the original repo, named aya-thekeeper, under aya-thekeeper/mempal, got deleted right after the repo got published.

you paid a random guy named lu to build this shit out for you. ( "Written by Lu (DTL) — March 24, 2026. For: Ben." ) - benchmark md file. lu wrote the code. lu wrote the benchmarks. lu is nowhere in the readme, or mentioned in the github history?

the git history then got squashed to one commit and published under milla jovovich? seriously? an actress? you say she is a great friend of yours, she has been building this project with you, she does this at night. yet she has.. 7 commits and only 2 active days in her entire github history?

you paid an actress and a random guy to promote a product you know absolutely nothing about.
Ben Sigman@bensig

30 second explanation of the MemPalace by Milla Jovovich. By day she’s filming action movies, walking Miu Miu fashion shows, and being a mom. By night she’s coding. She’s the most creative, brilliant, and hilarious person I know. I’m honored to be working with her on this project… more to come.

Berat@beratfromearth·
@0xSero 5.3-Codex is indeed autistic and has no interest in chatting or doing anything besides coding
0xSero@0xSero·
GPT-5.3-Codex is still the best coding agent, no doubt about it. GPT-5.4 is better at computer use, but doesn't match the sheer autistic power Codex holds.
[image]
Christian Waldow@cwaldow·
@mkurman88 “Other” could be OpenCode, for example. I did enjoy the free launch while it lasted, but it became ridiculously expensive recently.
Mariusz Kurman@mkurman88·
The business plan in Codex is the worst deal I've ever made. 903 credits since March 17th (when the reset occurred). 57% of the weekly limit used. WTF is OTHER, by the way? I didn't use it outside of Codex CLI.
[3 images]
Christian Waldow@cwaldow·
@0xSero For me it’s the other way around. Codex has become very expensive these days.
0xSero@0xSero·
This is how I end my nights. Today:
- 1.5B tokens in Codex
- 22M tokens GLM
- 51M tokens Kimi
- 41M tokens Claude
- 14M tokens MiniMax
[image]
0xSero@0xSero·
Holy moly, MiniMax-M2.7 is amazing, watch till the end.
Christian Waldow@cwaldow·
@ai_with_gregor @mkurman88 I did a port from R to Python with various older models and got 10x the LOC. But it is already faster than the old one and has more features. Damn curious what GPT-5.4 or any other SOTA model would do to it.
Gregor Amon@ai_with_gregor·
@mkurman88 Why maintain when you can redo the thing in 3 days with a much better model in 6 months, with all the learning you had from a year of real-life operation? Tech debt doesn’t exist anymore, but that needs a change in mindset. We are not building for 20+ years now.
Mariusz Kurman@mkurman88·
After two months of heavy "coding" with AI agents, I have one conclusion: if your codebase already exists, is fully human-written, and you use agents to add or improve features, it works great. However, when you try to create something new from scratch, they tend to add so much overcomplicated spaghetti code that it's hard to maintain in the long run. No matter which coding model you use, sooner or later, you'll hit a wall you can't break through.
Christian Waldow@cwaldow·
@AzFlin @hemic_ No, but on 20 unrelated tickets at the same time. Planning is the most important skill these days…
AzFlin 🌎@AzFlin·
@hemic_ Yup. And we’re supposed to believe people are letting 20 agents run rampant all at the same time
AzFlin 🌎@AzFlin·
You 👏 Do 👏 Not 👏 Need 👏 Multi-Agent 👏 Orchestration 👏 Systems 👏
Christian Waldow@cwaldow·
@mkurman88 Like before, GPT shines as a planner - Codex remains the coding-model champ.
Mariusz Kurman@mkurman88·
Boy, GPT-5.4 is so gooood inside Codex. It's smart, fast, and one-shots most tasks, but it stops frequently without checking if everything is okay, often skips tests, and doesn't try to build the project before claiming everything is finished - something GPT-5.3 Codex always did. Fix it, guys.
Mariusz Kurman@mkurman88·
And luckily look what we have here
[image]
Christian Waldow@cwaldow·
@thdxr I also like Minimax as workhorse. But like any wild horse it takes an experienced rider…
dax@thdxr·
it's been probably a month since i've used any claude models in opencode

kimi/glm for most tasks since they're fast af, and then gpt for everything big that i want to background

great pairing
dax@thdxr·
@franzstupar what are you observing? as of 2 weeks ago performance should be near identical
Christian Waldow@cwaldow·
@0xSero Many are complaining about this model - but it’s about the instructions to tame it, right?
0xSero@0xSero·
Local MiniMax just did a 30 minute run and built a perfectly functional twitter/x automation system from a scope. What a life.
[image]
0xSero@0xSero·
When you send a message to an LLM and something fails, and you screenshot that prompt and response into another model to debug, it will almost always not just fix the issue you ran into, but also do whatever you prompted model A to do. It's very interesting how common this is.
lyv ⌘@wholyv·
Okay, I am impressed: Kilocode > OpenCode.

If Claude Code can implement model agnosticism in their terminal, then Claude Code > Kilocode > OpenCode.

I don’t see any reason now why people would use Cursor or Antigravity.
Christian Waldow@cwaldow·
@nummanali Moderato didn’t convince me. Token limits seemed to drop week after week.