
kapicode

@kapicode
Building in public. Currently working on a custom Ralph implementation that is harness-agnostic and has a TUI for tracking progress: https://t.co/eYMJvNLIuI



Tested Qwen 3.5 122B + MTP on Strix Halo, sweeping kyuz0's pre-built optimised images. Results are a mixed bag: it's a bit faster, which makes it very usable for chat, but still not fast enough to feel fluid in agentic workflows.

llama.cpp args:
```
llama-server --no-mmap -dio -ngl 99 -np 1 --kv-unified --spec-type draft-mtp --spec-draft-n-max 3
```



Running @antirez's DS4 vs Nemotron Nano 30B as coding-agent orchestrators on a GB10. The surprising part so far: the model that's ~5–7x slower (DS4) might still be the better orchestrator brain. It comes down to recovery and cache reuse, not raw speed. Full numbers and runbook coming soon.




For every person who replies with a screenshot of their cancelled Claude Code plan, I will donate $10 to open source.
