Monkey Chief
631 posts

Monkey Chief
@MonkeyChief53
Just an IT guy who’s into Blockchain/Crypto, Ethical Hacking, and fitness. #GoPackGo🧀




My AI proxy setup in plain English - All my AI tools go through one shared control center - The registry keeps app settings consistent - The gateway checks access, chooses the right AI backend, and can fall back if needed - Behind it are local models / services + cloud providers


qwen 3.6-27b dense q4 on a single 3090 just knocked down 10 out of 10 tests at 40 tok/s on the first particle swarm benchmark i wrote for local agentic coding. it built the two files exactly to spec. then it used browser automation tools to open the page, read the test hud, find the failing tests, iterated through the code, patched tests.js, adjusted hue mapping, boosted mouse attraction force, and landed all 10 green checkmarks. this was actual dev behavior. not just generating code but also debugging its own output. i spent the next 8 minutes playing with the result. the boids flocking feels alive, the trail-blend is cinematic, three palettes cycle with space, mouse burst fires particles from click, drag paints a line through the swarm. this is what local ai looks like now. single 3090. hermes agent. no frameworks. no tricks. dropping the video next.






dude! the new qwen 3.6-27b dense is hammering my single 3090 at 100% gpu utilization. the spiky pattern on nvtop is the hermes agent autonomously thinking, calling tools, reading results, thinking again. this model is so cool to talk to. waits for tool outputs, reads them, selfcorrects, keeps going. no stalls, no loops, no hand holding. anyone running a single 3090 or any 24gb tier card should try this. same llama.cpp flags from last sweep, same hermes agent install. three commands and you are watching your own hardware think.

Qwen3.6-27B can now run locally! 💜 Run on 18GB RAM via Unsloth Dynamic GGUFs. Qwen3.6-27B surpasses Qwen3.5-397B-A17B on all major coding benchmarks. GGUFs: huggingface.co/unsloth/Qwen3.… Guide: unsloth.ai/docs/models/qw…








Used Codex Cli to profiled Qwen 3.5 9B Dense (Unsloth's UD-IQ3_XXS via llama.cpp) for Hermes Agent Tuning: > context length > batch size > tokens/sec > peak memory To squeeze every last drop out of an 8GB VRAM card







My house has 33 GPUs. > 21x RTX 3090s > 4x RTX 4090s > 4x RTX 5090s > 4x Tenstorrent Blackhole p150a Before AGI arrives: Acquire GPUs. Go into debt if you must. But whatever you do, secure the GPUs.



@sudoingX to be clear, which model / quantization did you run on the 3090?



