
qwen 3.7 api results look insane. when open weights? don't leave us local builders behind. if we want qwen 3.7 open weights, like and repost. let them hear us.
Derek Colley
4.3K posts

@DerekColley_
Consulting Technology Lead, CTO & CIO Building https://t.co/XGysrW0D1H

qwen 3.7 api results look insane. when open weights? don't leave us local builders behind. if we want qwen 3.7 open weights, like and repost. let them hear us.


so yesterday i dropped the bench numbers and what fits. today is the actual agent running on this 10 year old gpu card. qwen3 8b q4_k_m on a gtx 1080 8gb. hermes agent loaded with full tool set, browser controls live, nvtop pinned at 100% gpu 7.5gb of 8gb vram occupied. the unsloth weights pulled directly from huggingface, q4 quant, llama.cpp built for sm_61 (the pascal compute capability that everyone forgot exists). 31 tok/s gen speed, faster than most people read. this is what happens after the bench. raw perf was the receipt for what fits. now we test what actually works. agent loops, tool calls, real coding tasks coming next. ten year old card, $150 used, running a current open weight model with a current agent. nothing exotic. just the right quant, the right kv cache trick, the right engine compiled for the right arch. tell me what gpu you have, i'll tell you what runs.




I had a chat about context + agentic engineering with Eric the founder of Repoprompt and a member of the rate limited podcast. I've learned a lot from Eric over the last 6 months, he has a great understanding of how to best utilise AI agents. Enjoy











Local AI landing page generation on a DGX Spark. One Gemma-4-26B Q4 GGUF served by llama.cpp with 7 concurrent decode slots. The orchestrator breaks “landing page” into 6 section briefs: hero features steps testimonials pricing CTA Then 6 Gemma instances generate the sections in parallel and stitch everything into one Tailwind page. ~3 minutes end to end. The best part: everything you just watched happens offline, forever. No one can turn it off besides my light company lol @googlegemma @NVIDIAAIDev


