Mr Q
99 posts

Mr Q
@SW_Quigley
Serial entrepreneur, 3 exits. Founder @ Platform Q AI. Chair @ SHIFT Marketplace. Forbes Council. Building the future of AI agents.

MoE vs dense offload on 8GB VRAM MoE offload is 10.8x faster than dense offload on 8GB VRAM. here's the proof. I tested Qwen3.6 35B A3B (MoE, 3B active) vs Qwen3.6 27B (dense, 27B active) on my RTX 4060 Ti 8GB. the numbers: >MoE (-ncmoe 30): 35.4 tok/s >dense (-ngl 20): 3.28 tok/s ratio: 10.8x it gets worse at longer context. at 24K tokens, the gap is 16.7x. MoE has zero context degradation (SSM layers), dense loses -35.4%. why: MoE expert offload keeps the hot path (3B active params) entirely in VRAM. only inactive experts move to CPU when selected. dense layer offload splits every layer across GPU and CPU. every token bounces through PCIe for all 64 layers. the bandwidth bottleneck is fatal. quality is slightly better on dense (5/6 vs 4/6). the 27B model has the best hallucination resistance of all 9 models I tested. if you have 8GB VRAM and a model that doesn't fit: MoE with expert offload, not dense with layer offload.








this is what my setup looks like today. about to test qwen 3.6 27b dense q4 on a single rtx 3090 at ~41 tok/s gen, hermes agent driving. predecessor model qwen 3.5 dense q4 made it work in one iteration when i ran the same agentic build on the same card. i've been daily driving qwen 3.6 27b dense for weeks now, the model i keep coming back to. if 3.6 oneshots too, this becomes the best model that runs on a single rtx 3090. consumer tier king. firing the test now will report back soon.






Interesting how it works Elon puts up his own money, rounds up the absolute best AI talent on the planet, leverages every connection he has to secure serious resources, and launches OpenAI in 2015 as a pure non-profit explicitly created to develop AI for the benefit of humanity, with zero profit motive and open research Then the “team” decides they want the bag They push Elon out, take control, and quietly flip the entire thing into a for-profit machine All while preaching the same sanctimonious lines on repeat: “We’re still mission-driven!” “AI for the good of humanity!” “We’d never abandon our principles!” The ultimate betrayal: Elon got zero equity. Not a single share. He funded it. He built the foundation. He got nothing while they turned his non-profit into their personal cash cow This is the level of betrayal and hypocrisy we’re dealing with And for the record.... this lawsuit doesn’t put a single penny in Elon’s pocket. Any win goes straight back to the non-profit to restore the exact mission he founded


Introducing Claude Design by Anthropic Labs: make prototypes, slides, and one-pagers by talking to Claude. Powered by Claude Opus 4.7, our most capable vision model. Available in research preview on the Pro, Max, Team, and Enterprise plans, rolling out throughout the day.












