Waleed Kadous

75 posts

@waleedk

Founder, @Cluesmith — makers of Codev + Multisage. Ex AI Eng Lead @Canva, Chief Scientist @anyscalecompute, Eng Strategy @Uber, Principal Engineer

San Francisco Bay Area, USA · Joined March 2009
183 Following · 873 Followers

Waleed Kadous (@waleedk)
The real question isn't "is AI fast?" — it's "can AI be disciplined?" Unstructured AI coding is fast but produces 2.9x fewer tests and zero deployment artifacts. Structured AI coding costs more but catches bugs that would otherwise ship silently.

Waleed Kadous (@waleedk)
I used AI to manage AI that writes code. Just published a deep dive on what that actually looks like — directing autonomous AI builders on an 80K-line TypeScript codebase. Here's what worked, what didn't, and the real numbers. 🧵

Waleed Kadous (@waleedk)
@omarsar0 Facing the same user experience. I have a repo called "life" that is connected to GDrive, GCal and WA, and whenever I need a skill (e.g. notion integration), I just construct it on the fly. I just don't understand what the big deal is.

elvis (@omarsar0)
To be clear, it's a cool project. I am asking because I want a good reason to give it a try on an old MacBook I have sitting around, but I haven't seen any special use case that I don't have covered with my setup (basic agent harness sitting on top of cc and codex). Making it easier to do cool things with agents is not what I am looking for. Often, that leads to unnecessary complexity that can be prevented in the first place by choosing the right set of tools and being in more control of the architecture. I am genuinely curious if anyone has figured out a cool and unique use case to test. I think the memory stuff is interesting, but I believe my extremely simplified memory system is working amazingly well for me at the moment.

elvis (@omarsar0)
There isn’t a single use case that I have seen done with OpenClaw that I can’t do with Claude Code. What am I missing? I don’t use Telegram for real work but is there anything else?

Waleed Kadous (@waleedk)
Each LLM was run 10 times on 2 platforms (local Mac and GCloud) — 60 runs total. About $40 in tokens for the full benchmark.

Waleed Kadous (@waleedk)
Codev makes this easy — Codev Cloud lets you connect your own GCloud box and manage builders from your browser or phone. No complex setup. codevos.ai

Waleed Kadous (@waleedk)
PSA: If you're using Gemini CLI for coding, it runs 4.7x faster on a GCloud box than locally. We benchmarked Gemini CLI, Codex CLI, and Claude Code on the same task — local vs cloud. The difference for Gemini is wild.
[Image attached]

Waleed Kadous (@waleedk)
I don't know if this scales to teams yet. Complex specs still overwhelm reviewers. False positive rate is ~18%. I'd rather you try it knowing the limits. If you want to see what structured AI development looks like: codevos.ai

Waleed Kadous (@waleedk)
Around v2.0.7 something clicked. Codev builds itself — self-hosted from day one. Early on it was clunky. Builders needed hand-holding. Reviews missed things. Then the protocols matured. 85% autonomous completion rate. The tool building itself, well enough to trust.

Waleed Kadous (@waleedk)
Codev 2.0 just shipped. It's an OS for humans and AI to build production code together — not vibe coding, real software. The velocity chart tells the story: sustained double-digit issue closes per day, accelerating through the sprint. codevos.ai/getting-started
[Image attached]

Waleed Kadous (@waleedk)
Every so often an AI agent does something uncannily human. I told my backend architect to introduce itself to the frontend architect. No script, just "say hi." The message was perfectly sensible. But it closed with "Looking forward to working together."
[Image attached]

Waleed Kadous (@waleedk)
🚀 Codev v1.5.x "Florence Series" is here!
✨ Secure Remote Access - SSH tunnel to Agent Farm from anywhere
🎨 3D Model Viewer - Native STL/3MF support with auto-reload
🏗️ New af architect command for streamlined workflows
Install: npm install -g @cluesmith/codev
Details: github.com/cluesmith/code…

Waleed Kadous (@waleedk)
I'm back on X! I'll start posting more about what I'm working on at Cluesmith.

Waleed Kadous (@waleedk)
A concrete example of why multi-agent approaches win: every agent has strengths and weaknesses. GPT-5.2 is _really good_ at finding near-duplicate code and refactoring it. This is on Codev (github.com/cluesmith/codev). GPT-5.2 outperformed Claude by 115% in identifying duplicate code. The entire codebase is ~15,000 lines, so roughly speaking, Claude got rid of 1% and GPT-5.2 got rid of another 1%. That is a _huge_ difference. I verified that the stuff GPT-5.2 removed was actual duplicate code.
[Image attached]