Currently digging into my Qwen3.6 27B evals and some results are… weird.
It’s clearly better than Qwen3.5 27B on some tasks, but worse on others, especially instruction-following.
I also couldn’t reproduce their published GPQA Diamond score. In my setup, Qwen3.5 is significantly ahead. When I see results like this, I usually check Artificial Analysis, and they seem to get similar numbers: Qwen3.5 >> Qwen3.6 on this one.
I’ll share more next week once the full analysis is done, but right now Qwen3.6 27B doesn’t look like the obvious pick some people are making it out to be.
Gemma 4 31B, or even Qwen3.5, can still be much better depending on whether you care about accuracy, efficiency, or the specific task.
@JamesNumb3rs I’m using the UD-IQ3_XXS gguf for both 27B and 35B. Only have a single RTX 5080 (16 GB)
Try q8_0 for the KV cache (or no quantization at all); I’m using q4_0 and can tell it doesn’t do the model any favors
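For reference, this is roughly how the KV cache type is set in llama.cpp (flag names as in recent llama.cpp builds; model path and context size here are placeholders):

```shell
# Launch llama-server with an 8-bit KV cache instead of the default f16.
# -ctk / -ctv set the cache type for keys and values; note that a
# quantized V cache requires flash attention (-fa) in llama.cpp.
llama-server \
  -m ./model-UD-IQ3_XXS.gguf \
  -ngl 99 \
  -c 16384 \
  -fa \
  -ctk q8_0 -ctv q8_0
```

q8_0 roughly halves KV cache memory vs f16 with little measurable quality loss; q4_0 halves it again but, as noted above, the degradation can be noticeable.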
APEX quantizations of more models ongoing!
Meanwhile, playing with Qwen 3.5… the impact of APEX vs Unsloth Dynamic quants on quality is clearly visible IMO, at least in some areas.
I know we need more numbers before drawing conclusions, but this isn't about numbers. Just check out a simple prompt: "create an html page of a rotating cube in SVG."
Left: Unsloth Qwen3.5-35B-A3B-UD-Q8_K_XL.gguf (48.7 GB, ~32 tok/s) → flat square (?????)
Right: APEX Qwen3.5-35B-A3B-APEX-I-Quality.gguf (22.8 GB, ~53 tok/s) → ✨
@bnjmn_marie even with -np X and perhaps several server copies at the same time?
maybe it would be nice to have a proper comparison of the engines too.
List of quantized Gemma 4 31B I’m evaluating:
- Intel/gemma-4-31B-it-int4-AutoRound (19.2 GB)
- cyankiwi/gemma-4-31B-it-AWQ-4bit (20.5 GB)
- RedHatAI/gemma-4-31B-it-NVFP4 (23.3 GB)
- nvidia/Gemma-4-31B-IT-NVFP4 (32.7 GB)
- RedHatAI/gemma-4-31B-it-FP8-block (33.3 GB)
→ yes, NVIDIA’s NVFP4 checkpoint is as large as an FP8 checkpoint. This is what happens when you don’t quantize the attention layers of a dense model.
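Rough back-of-envelope on those file sizes (my arithmetic, treating GB loosely and ignoring FP4 block-scale overhead and embeddings): a fully 4-bit 31B checkpoint would be about 15.5 GB, so a 32.7 GB file implies a large unquantized fraction.

```python
# How much of a 31B-param checkpoint must stay in BF16 (2 bytes/param)
# for the file to reach 32.7 GB, if the rest is FP4 (0.5 bytes/param)?
params = 31e9
size_gb = 32.7

avg_bytes = size_gb * 1e9 / params      # ~1.05 bytes per parameter
f = (avg_bytes - 0.5) / (2.0 - 0.5)     # solve f*2 + (1-f)*0.5 = avg_bytes

print(f"fully-FP4 size:        {params * 0.5 / 1e9:.1f} GB")
print(f"BF16 fraction implied: {f:.0%}")
```

That works out to roughly a third of the weights left in BF16, which is consistent with attention (and possibly other sensitive layers) being kept at high precision — and attention is a much bigger share of the parameters in a dense model than in a sparse MoE.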
@bnjmn_marie generally the most interesting question is what can you fit into 11GB, 15GB, 23GB, 31GB... past that, it's just macs and rtx pros, and those can run almost anything anyway.
@bnjmn_marie how so? llama with quants is consistently faster than vllm, at least every time I tried.
also, maybe the test battery could be reduced: for small models Q4_K_M, maybe IQ4_NL & IQ3_XXS, + some smarter Q2s on 200B+ models?
those are probably the only ones that need to be tested really
@0xSero I would really appreciate a 12GB level for all the 3060 owners please.
Are we better off with 9B variants or trying the 2.5 bit Unsloth 27B version? Or the A3B?
@LenSeaside @stevibe 27B UD-IQ3_XXS [-ngl 65 + 36K Q4 KV cache]: 1100–1200 t/s prompt processing, 36–37 t/s generation
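For anyone wanting to reproduce that setup, my guess at the llama.cpp invocation behind the bracketed settings (model filename is a placeholder):

```shell
# -ngl 65: offload 65 layers to the GPU; -c 36864: 36K context;
# q4_0 KV cache for keys and values (a quantized V cache needs -fa).
llama-server \
  -m ./27B-UD-IQ3_XXS.gguf \
  -ngl 65 \
  -c 36864 \
  -fa \
  -ctk q4_0 -ctv q4_0
```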
but only if you connect the display to the motherboard (the CPU’s iGPU); that will get you 1–3 GB of VRAM back.
@stevibe Can you include 9B in your analyses please?
I only have a 12GB GPU.
I would be very interested to know if it's much worse or not that bad etc. and what I can do to help it along in terms of where it fails.
"122B has to be smarter than 27B"
I showed 4 UI components to three Qwen3.5 models and asked them to recreate them from a screenshot alone:
- 27B (dense)
- 35B-A3B (MoE)
- 122B-A10B (MoE)
Same screenshot. Same prompt. Same task.
Which one do you think nailed it?
I made a dataset from every AI chat and session I ever made, and passed it to Qwen3.5-35B
- 7.6% of the model handled 50% of my requests
- 27.5% handled over 80% of my requests
That means I could technically cut a model down to 7.6% of its weights for my use case (3.8% of its original size in fp8). someone give me 1 million pls
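For the curious, a sketch of the coverage calculation behind those percentages, with hypothetical heavy-tailed activation counts standing in for the real per-expert routing stats from the 35B run:

```python
import numpy as np

# Hypothetical per-unit activation counts (e.g. how often each expert
# or neuron fired across the whole chat dataset).
rng = np.random.default_rng(0)
counts = rng.zipf(1.5, size=10_000).astype(float)  # heavy-tailed usage

def fraction_covering(counts, target):
    """Smallest fraction of units whose activations cover `target` of
    total activity, taking the most-used units first."""
    c = np.sort(counts)[::-1]          # sort descending by usage
    cum = np.cumsum(c) / c.sum()       # cumulative share of activity
    k = int(np.searchsorted(cum, target)) + 1
    return k / len(c)

print(f"{fraction_covering(counts, 0.50):.1%} of units cover 50% of activity")
print(f"{fraction_covering(counts, 0.80):.1%} of units cover 80% of activity")
```

Sort units by usage and read off the smallest prefix covering the target share; the actual percentages depend entirely on the dataset and how activity is attributed.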
@Ben68638515 @jonasgahrstore @bundeskanzler good buddy, Art. 69(a) clearly states that insane people shouldn’t post on the interwebz without professional supervision, & your posts are in clear violation. please rectify your misconduct immediately
also - check yourself before you wreck yourself. it's not looking good.
@eldeffo @jonasgahrstore @bundeskanzler You are free to present scientific arguments against every single one of the ten pieces of geophysical evidence. Until you have done that, your post represents intentional misinformation in violation of Art. 70(c) of the ICC Rome Statute.
A strong partnership between Norway and Germany will be even stronger. On energy, industry, defence, climate, space and support to Ukraine. Thank you @bundeskanzler Friedrich Merz for a substantive meeting in Berlin. It all amounts to mutual European security. (Photo: Uwe Koch)
Did you, in your conversation with BK Merz, also address the fact that it was your very own Norwegian tax-funded seismic agency NORSAR (a joint venture with the US Los Alamos nuclear lab) which deliberately covered up the nuclear nature of the destruction of Nordstream:
Indeed, Nordstream was nuked (sic!) as part of the US/NATO extortion racket:
US LNG export capacity increased from ~0 in ’16 to that of Nordstream 1+2 in ’20 (within a few percent) before Nordstream was destroyed by a Mini-Nuke under US/NATO auspices, exactly on Donetsk Referendum Day, serving as a covert shock wave attack towards Kaliningrad as evidenced by seismic measurements. This was indeed a US/NATO masterstroke: the covert nuclear nature of this attack subjugated DE and DK unconditionally to US/NATO orders, and coerced SW & FI into NATO.
A summary of my evidence that proved the nuclear (!) destruction of Nordstream 1+2 beyond the shadow of a doubt was presented to the UN Security Council on Sept 26 2023 (SC/15422). Instead of following up on my report, responsibility was offloaded to authorities of Sweden, Denmark and Germany with the former two promptly aborting their investigations shortly afterwards, while the matter was - to this very day - intentionally buried by DE.
x.com/Ben68638515/st…
@RALee85 Even if it’s 10x worse, the Russians will prevail over Ukraine. That’s the goal and nothing will stop them. Looking back at Russian history, even the most deranged russophobes can’t picture a scenario where the SMO can be turned around in Ukraine’s favor.
“Guided by President Vladimir Putin's crack team of economic officials, Russia has reported better average growth than the euro zone over the past four years despite being hit by some 24,000 Western sanctions.
But high interest rates, higher taxes, rising prices and a $20-per-barrel discount for Russian oil are taking their toll - even in Moscow, a vast urban area of 22 million people that has been largely insulated from the worst impact of Europe's deadliest war since World War Two.
"To let" signs are prominent in retail spaces across the capital. Sales of new light commercial vehicles and trucks, a good indicator for the health of the retail and construction industry, fell by 38% to 147,000 units in 2025 and have continued to fall in the first weeks of 2026, Autostat said.
Data from Sberbank, which as Russia's biggest bank sees the ripples of expenditure across the economy, showed that the fall in the number of catering outlets in January was the biggest since 2021 and that restaurant spending hit the lowest in three years in November-early December 2025.”
reuters.com/business/russi…
A big thank-you to the Slovaks. You could have made us eat those two years of our mockery, but you didn’t. On the contrary, you’re showing immense sympathy. Thanks once again.
@Posledniskaut don’t worry, those 2 years of Czech assholery are safely stored in memory.
and we long ago got used to Czechs completely losing the ability to tell reality apart from their ridiculous notions of their own superiority.
that you play the big shots while keeping the toilet out in the hallway is nothing new.
@BarronTNews_ didn't stun anyone, no one in the EU gives a damn about this clown.
also, he'll 💯% end up in jail for all the corruption after the next election.
🚨 HOLY CRAP. Slovakia PM Robert Fico just STUNNED the EU after speaking with Donald Trump and Marco Rubio about Greenland.
“Trump is clearly pursuing the nation-state interests of the United States. If the EU acted the same way, we would be in a completely different position.”
BOOM. 🔥
And then he went for the jugular.
“World leaders do not take the EU fully seriously. That’s because of our nonsensical climate targets and our suicidal migration policy.”
That’s the truth they’re terrified to say out loud.
America First works. National interest works. Strength works.
This guy is BASED. 💯💯
just tried running GLM 4.7 Flash locally in Open Code
terrible experience
I've got 96 GB of memory and the model still ran so slowly it was unbearable
anyone saying open source models running on your own computer are the future is either lying to you or insanely delusional lol