
coderofstuff
@coderofstuff_
There are 10 types of people in the world - those who know binary and those who don’t | coding DAGKnight https://t.co/WZfxjGEEgq

hey if you have a 3060, or any GPU with 8GB or more sitting in a drawer right now, that thing can run 9 billion parameters of intelligence autonomously. and you don't know it yet.

2 hours ago i posted that 9B hit a ceiling. 2,699 lines across 11 files. blank screen. said the limit for autonomous multifile coding on 9 billion parameters is real.

then i audited every file. found 11 bugs. exact file, exact line, exact fix. duplicate variable declarations killing the script loader. a canvas reference never connected to the DOM. enemies with no movement logic. particle systems called on the class instead of the instance.

fed that list as a single prompt to the same Qwen 3.5 9B on the same RTX 3060 through Hermes Agent. it fixed all 11. surgically. patch level edits across 4 files. no rewrites. no hallucinated changes. game boots. enemies spawn, move, collide. background renders. particles fire.

and here's what nobody is talking about. this is a 9 billion parameter model running a full agentic framework. Hermes Agent with 31 tools. file operations, terminal, browser, code execution. not a single tool call failed. the agent chain never broke. most people think you need 70B+ for reliable tool use. this is 9B on 12 gigs doing it clean.

the model didn't fail. my prompting strategy did. the ceiling is not the parameter count. the ceiling is how you prompt it.

this is not done. bullets don't fire yet. boss fights need wiring. but the screen that was black 2 hours ago now has a full game rendering in real time. iterating right now. anyone with a GPU from the last 5 years should be paying attention to what is happening right now.
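one of the 11 bugs above, "particle systems called on the class instead of the instance," is a classic JS mistake worth seeing concretely. a minimal sketch — `ParticleSystem` and `emit` are illustrative names, not the actual game's code:

```javascript
class ParticleSystem {
  constructor() {
    this.particles = [];
  }
  // instance method: lives on the prototype, needs a `new` instance
  emit(x, y) {
    this.particles.push({ x, y, life: 60 });
  }
}

// the bug: calling the method on the class itself.
// ParticleSystem.emit(10, 20); // TypeError: ParticleSystem.emit is not a function

// the fix: construct an instance and call the method on it.
const fx = new ParticleSystem();
fx.emit(10, 20);
console.log(fx.particles.length); // 1
```

the crash-free version of the bug is even sneakier: if `emit` had been declared `static`, the class-level call would silently run but never touch any instance's particle array.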


if you're running Qwen 3.5 on any coding agent (OpenCode, Claude Code) you will hit a jinja template crash. the model rejects the developer role that every modern agent sends. people asked for the full template. here it is. two paths depending on which model you're running:

path 1: patch base Qwen's template. add developer role handling + keep thinking mode alive.

full command:

llama-server -m Qwen3.5-27B-Q4_K_M.gguf -ngl 99 -c 262144 -np 1 -fa on --cache-type-k q4_0 --cache-type-v q4_0 --chat-template-file qwen3.5_chat_template.jinja

template file: gist.github.com/sudoingX/c2fac…

without the patched template, --chat-template chatml silently kills thinking. server shows thinking = 0. no reasoning. no think blocks. check your logs.

path 2: run Qwopus instead. Qwen3.5-27B with Claude Opus 4.6 reasoning distilled in. the jinja bug doesn't exist on this model. thinking mode works natively. no patched template needed. same speed, same VRAM, better autonomous behavior on coding agents.

weights: huggingface.co/Jackrong/Qwen3…

both fit on a single RTX 3090. 16.5 GB. 29-35 tok/s. 262K context.





@coderofstuff_ @emdin @IzioDev @averagecatdog Built a simple visualizer showing how the same TX gets included in multiple blocks. I might be thinking incorrectly. rossku.github.io/kaspa-tx-sprea…



Rusty Kaspa v1.1.0 is out. Faster syncing, less storage, and exchanges/wallets building on Kaspa just got a much easier time of it.

The big one for integrators is a new API call that returns chain updates and transaction data together in one go, instead of having to juggle multiple parallel requests. If you've ever tried to integrate Kaspa and cursed at the DAG complexity, this is the fix.

Node operators get up to 3x faster sync in the early header stage on some machines, plus lower disk usage. The stratum bridge also shipped as beta if you're running mining infrastructure.

One heads-up - DB version bumped to 6. Upgrade is automatic, but you can't roll back to an older version without wiping the DB. github.com/kaspanet/rusty…

🎉Good news! 📢 The long-awaited @Ledger integration is complete! Securely manage and store your $KAS on Nano S, Nano S+ and Nano X! Download the #Kaspa app via #LedgerLive and use kasvault.io to interact with your new app. Ledger Guide: support.ledger.com/hc/en-us/artic… (Link to KASVault user guide at the bottom) #L1 #ProofofWork #DigitalSilver #CryptoStorage




if you try to run qwen 3.5 27B with OpenCode it will crash on the first message. OpenCode sends a "developer" role. qwen's template only accepts 4 roles: system, user, assistant, tool. anything else hits raise_exception('Unexpected message role.') and your server returns 500s in a loop.

unsloth's latest GGUFs still ship with the same template. the bug is in the jinja, not the weights. no quant update will fix it.

the common fix floating around is --chat-template chatml. it stops the crash. it also silently kills thinking mode. your server logs will show thinking = 0 instead of thinking = 1. no think blocks. no chain of thought. you're running a reasoning model without reasoning and the server won't tell you.

the real fix: patch the jinja template to handle the developer role + preserve thinking mode. add this to the role handling block:

elif role == "developer" -> map to system at position 0, user elsewhere
else -> fallback to user instead of raise_exception

full command with the fix:

llama-server -m Qwen3.5-27B-Q4_K_M.gguf -ngl 99 -c 262144 -fa on --cache-type-k q4_0 --cache-type-v q4_0 --chat-template-file qwen3.5_chat_template.jinja

thinking = 1 confirmed. full think blocks. no crashes. that's what's running in the video in the thread below.

if you've been using chatml as a workaround, check your server logs for thinking = 0. you might be running half a model.
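for the curious, here is a sketch of what that role handling patch can look like inside the jinja template. this is illustrative only, assuming Qwen's ChatML-style <|im_start|> message format — the full patched template is longer (it also carries the thinking-mode handling, omitted here):

```jinja
{%- for message in messages %}
    {%- if message.role == "developer" %}
        {#- developer role: map to system at position 0, user elsewhere #}
        {%- if loop.first %}
            {{- '<|im_start|>system\n' + message.content + '<|im_end|>\n' }}
        {%- else %}
            {{- '<|im_start|>user\n' + message.content + '<|im_end|>\n' }}
        {%- endif %}
    {%- elif message.role in ['system', 'user', 'assistant', 'tool'] %}
        {#- qwen's original handling for the 4 supported roles stays here, unchanged #}
    {%- else %}
        {#- any other unknown role: fall back to user instead of raise_exception #}
        {{- '<|im_start|>user\n' + message.content + '<|im_end|>\n' }}
    {%- endif %}
{%- endfor %}
```

the key change is that no branch ever reaches raise_exception, so OpenCode's developer messages render instead of 500ing the server.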
