Mark
46.4K posts

@mwotton
freelance goblin technologist

single RTX 3060. 12GB VRAM. AMD EPYC 7282. Qwen 3.5 9B Q4_K_M running through Hermes Agent. 5.3 gigabytes of model on a card most people bought to play Warzone. compiled llama.cpp from source. loaded a 9 billion parameter model that outscores models 13x its size on reasoning benchmarks. plugged it into a full autonomous agent with 29 tools, terminal access, file operations, browser automation, persistent memory across sessions. will run the same octopus invaders prompt i've been testing across every config. Qwen 3.5 27B dense on a 3090. the 35B MoE. the 80B coder. hermes 4.3 36B. now the 9B on a 3060. same test. same standard. different floor. the 3060 has more VRAM than the 3070. 12GB vs 8GB. the most underrated budget AI card on the market and most people don't know it.
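The 5.3 GB figure checks out from first principles: Q4_K_M averages somewhere around 4.7 bits per weight (an assumed effective rate — the actual mix of 4- and 6-bit quant blocks varies per model), so a 9B-parameter GGUF lands just over 5 GB and leaves real headroom on a 12GB card for KV cache and context. A quick sanity check:

```python
# Back-of-envelope size check for a Q4_K_M quantized 9B model.
# Assumption: ~4.7 effective bits/weight for Q4_K_M (mixed quant blocks).
PARAMS = 9e9           # 9 billion parameters
BITS_PER_WEIGHT = 4.7  # assumed effective rate, not an exact spec

model_gb = PARAMS * BITS_PER_WEIGHT / 8 / 1e9  # bits -> bytes -> GB
headroom_gb = 12 - model_gb                    # on a 12GB RTX 3060

print(f"model file: ~{model_gb:.1f} GB")                    # ~5.3 GB
print(f"VRAM left for KV cache / context: ~{headroom_gb:.1f} GB")
```

The same arithmetic explains why the 8GB 3070 is the worse AI card despite being the faster GPU: the model alone would eat two-thirds of its VRAM.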


I'm mildly peeved at consciousness: I can have a clear & prolonged seeing of no-self, feel the ease & effortlessness that comes from that state, and then my mind still “decides” to revert back to the contracted “self inside the world” illusion. Wtf universe.


@aras_p look, I'm sorry, but the rule is simple: if you made something 2x faster, you might have done something smart; if you made something 100x faster, you definitely just stopped doing something stupid


Terence Tao says that for the next 10-20 years, humans and AI will possess complementary strengths. AI can synthesize a million papers and test every idea inside them. Humans can look at 5 examples and say, "I see the pattern now." "They can try to fake it, but they're very inefficient at it."



The Subtle Footgun of TVar (Map _ _), wherein I talk about livelock, concurrent data structures, and the semantics of what a reference really means (link in next post)
