
Unwitty
362 posts

Unwitty
@unwitty
slop inquisitor father of local gremlins clanker-autist symbiosis enjoyer actual bbs/usenet/efnet-era dinosaur
24°37′43″N 82°52′24″W Joined May 2008
712 Following 185 Followers

@Tono_Ken3 Yeah, I actually have two RTX6kPro Max-Q cards "suffocating" at x8/x8 on a 14700k/ASUS Maximus Hero z790. Prefill is quite a bit slower for sure, but I can't justify the $ to cpu/mobo/memory unless I'm also getting more 6k's.

Yep, 12 x 2000 = RTX24000
I thought it was a good option. x16 to x8/x8 bifurcation.
You might feel a slight effect with two cards. But when you add up four or eight cards, the effect will become non-linear.
In particular, the degree of freedom in combining card sets accelerates learning and data acquisition.


@antirez @pupposandro @ivanfioravanti thanks for the pointer. it's showing substantial improvement w/ ds4

@unwitty @pupposandro Try DS4. @ivanfioravanti and I did many tests recently, and we discovered that many inference engines do not do inference correctly for DeepSeek v4 Flash: you get bogus results because the model is damaged during execution.

A 26B model on a 24 GB laptop tied a 284B model on a 192 GB Mac Studio.
Both 78.3% on ds4-eval-92, the eval ported from @antirez's ds4 (huge fans of all his work on it).
To be honest DeepSeek V4 Flash is squeezed to ~2-bit to fit the Mac, Gemma 4 26B runs at 4-bit. But a model an eleventh the size held even, and ran ~5x faster: 101 vs 20.7 tok/s, median answer 9.8s vs 144.8s, in 24 GB instead of 192.
The takeaway is not that Gemma is the better model. A small MoE that happens to be strong on your workload gives you that quality at a fraction of the memory and latency.
Incredible work by @davideciffa and @easel!
Sandro@pupposandro
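A quick sanity check of the ratios quoted in the post above, as a minimal sketch. The numbers are taken from the post itself; the variable names are illustrative only and are not from the eval or from ds4.

```python
# Figures as quoted in the post; names here are hypothetical labels.
gemma_params_b, deepseek_params_b = 26, 284      # parameters, in billions
gemma_tok_s, deepseek_tok_s = 101.0, 20.7        # decode speed, tokens/s
gemma_median_s, deepseek_median_s = 9.8, 144.8   # median answer time, seconds
gemma_mem_gb, deepseek_mem_gb = 24, 192          # machine memory, GB

size_ratio = deepseek_params_b / gemma_params_b      # "an eleventh the size"
speedup = gemma_tok_s / deepseek_tok_s               # "~5x faster"
latency_ratio = deepseek_median_s / gemma_median_s   # median answer time gap
memory_ratio = deepseek_mem_gb / gemma_mem_gb        # memory footprint gap

print(f"size ratio:    {size_ratio:.1f}x")
print(f"speedup:       {speedup:.1f}x")
print(f"latency ratio: {latency_ratio:.1f}x")
print(f"memory ratio:  {memory_ratio:.1f}x")
```

The median answer time gap comes out larger than the tokens-per-second gap. The post does not say why, so any explanation (answer length, prefill time) would be a guess.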

@realcheeker Nice work and thanks for writing this up. I’m gonna give this a go on my dual system too!
Unwitty retweeted

recommended reading. i too am very done with people anthropomorphizing a bunch of matrices on a GPU cluster, especially if the same people do not give two fucks about actual human beings.
Armin Ronacher ⇌@mitsuhiko
More musings after some people got upset about the word clanker. lucumr.pocoo.org/2026/5/26/clan…

Unwitty retweeted

@Hikari_07_jp @unwitty i have a weird solution... i have a pool so was gonna use pool water to cool via a heat exchanger LOL
that way can also heat my pool at the same time

@unwitty @Hikari_07_jp i ended up getting 2 workstations, but yeah a part of me regrets not getting maxQ since i wanna keep it modular for upgrading to more

How many cards are you planning to get? If just one, WS edition makes most sense.
Otherwise, I'd go with Max-Q due to power and cooling advantages.
The Max-Q at 300w is only 5-10% lower performance than the WS at 600w. Undervolting the WS nullifies its perf advantage. Maybe @Hikari_07_jp can show us how this works, given he has versions of the cards?
Even when undervolting, your PSU will still need to support 600w+ due to spikes.
The Max-Qs exhaust through the rear and can be safely placed immediately next to each other. They're designed for multi-GPU builds within the same cabinet.
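The 300w vs 600w claim above can be turned into a rough performance-per-watt comparison. This is a sketch under stated assumptions: it uses only the rated power figures and the 5-10% deficit quoted in the post, treats the WS card as the 1.0 baseline, and ignores undervolting, power spikes, and workload differences. The function name and defaults are hypothetical.

```python
def perf_per_watt_ratio(perf_deficit, maxq_watts=300.0, ws_watts=600.0):
    """Max-Q performance per watt relative to the WS card.

    perf_deficit is the fraction of WS performance the Max-Q gives up
    (0.05 to 0.10 per the post). WS performance is normalized to 1.0.
    """
    maxq_perf = 1.0 - perf_deficit
    return (maxq_perf / maxq_watts) / (1.0 / ws_watts)


for deficit in (0.05, 0.10):
    ratio = perf_per_watt_ratio(deficit)
    print(f"{deficit:.0%} slower -> {ratio:.2f}x perf per watt")
```

Under these assumptions the Max-Q lands at roughly 1.8x to 1.9x the performance per watt of the WS card at stock limits. A power-limited or undervolted WS card would narrow that gap, which this sketch does not model.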

@Hikari_07_jp are you power limiting the workstation? i was thinking about getting maxQ but i was like i can just smi them to 350W anyways so might as well get the workstation.

@realcheeker Yep that makes sense. There are scenarios where compute becomes more and more reserved for the highest paying customers.
We see it with GPU pricing and the shift from consumer-grade to enterprise, but the frontier labs don't seem to be pricing APIs that way yet...

Anthropic's poor comms is enraging, but the important signal here is that "unlimited" plans are going away.
These subsidized Max/Pro plans are not economically viable, and exist only because they provided valuable training data to the labs.
Ralph loops, OpenClaw, and other uses that have low human-input to API call ratios, provide substantially less value to the labs.
As such, I don't blame Anthropic for cutting out those use-cases. I see devs hopping to Codex, but the reality is that eventually OpenAI will need to do the same.
I've been wondering if we'll end up with a model similar to the freemium video game space, where the whales spend heavily and the plebs provide the playing field, or training data in this case.
SIGKITTEN@SIGKITTEN
We've truly been damaged by anthropic not letting u do anything that it seems unintuitive when OpenAI just lets u use their API endpoints like... API endpoints Mad respect to the @OpenAI team for not only being normal about it but I also got a "lmk if u need help" DM!
Unwitty retweeted

I hear you and I don't fully disagree, but mostly I admit I'm not in the heads of others, so I cannot judge motivation, only interpretations of their behavior.
I've been passionate about open source since first installing minix in ~1995. I spent way too much of my life raging at MS's various strategies to keep Linux from taking off at the turn of the century. Those were my passionate 20s.
I took note of 0xSero's first declaration that "open source must win," and felt some indignation, because it seemed to have come out of left field, with little depth.
That said, in my experience, when I cross from assessment of behavior into character judgement, it's usually because I'm seeing something that exists within myself, reflected back at me by this other person. And in that case, I was seeing my own lack of material contributions to open source over the years...for fear of "doing it wrongly," which is what everyone is coming down on him for now. Maybe that's partially why I have a soft heart here.
The post you linked looks like someone shooting their shot. And by the looks of it, it worked, as someone from Google replied, offering an interview. What is the judgement here? Is it that someone unworthy is succeeding? That other people just don't see it and are helping him succeed?
Is wanting followers or showing that you want followers the real problem? Few would turn away followers.
IDK, these are the things I ask myself.

i understand where youre coming from and appreciate your nuance, i think you just lean more on the kinder side
personally, fair re first part, although, he is old enough to have a kid, he's old enough to yap on the internet to get $, he's old enough to write the most water-like, completely trivial spiel on open-source, then i think he's old enough for criticism
bro's already out there trying to grab some more: x.com/0xSero/status/…
re: Ahmad, these kind of guys (just like Sero) have never been popular or excellent at anything and any x fame tends to oneshot them, that much I agree.
"wanting followers" is already a very strong signal that person has nothing to offer

state of this app:
guy who psyops kids into "buying compute, go broke if you have to" calls out the most obvious grifter around (rightfully so as 0xSero is a showcase of how easy it has become to larp for "free money" and attention)
meanwhile the masses applaud and choose to support whoever they identify with the most (the epic "LOCAL MODEL" guy (read a couple industry people's pieces 2 yrs ago) or the "OPEN SOURCE MUST WIN" guy (discovered open source 6 months ago, vibecodes garbage to get 100k+ USD so far))
meanwhile literal idiots or just overall clout-demons who want visibility on X help fund the whole thing

Unwitty retweeted

@mempirate I am curious why Anthropic didn’t use Mythos to find and fix the issues in Bun.









