Justin Thyme 🇺🇸🐿️

sdmat@sdmat123·12h

@looking5452 @kdaigle @github I know someone who prefers low end models because they are fast, literal and unambitious. If you know exactly what you want and express it with precision the technical scribe may be as good as an ersatz engineer. This in no way means haiku is as capable as opus.

Kyle Daigle@kdaigle·1d

Hot take from looking at @github Copilot telemetry: benchmarks make coding models look wildly different. Production workflows make them look much more similar. 👀 We looked at 23M+ Copilot requests and examined one simple metric: code survivability.
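
The tweet does not define "code survivability". One plausible reading, purely an assumption here, is the share of suggested lines that still appear in the file some time later. A minimal Python sketch under that assumption (the function name and the exact-match rule are hypothetical, not GitHub's method):

```python
def survivability(suggested_lines, later_file_lines):
    """Fraction of non-blank suggested lines still present in a later snapshot.

    Assumption: a line "survives" if an identical line (ignoring surrounding
    whitespace) exists anywhere in the later file. Real telemetry is
    presumably more nuanced than this.
    """
    suggested = [line.strip() for line in suggested_lines if line.strip()]
    if not suggested:
        return 0.0
    later = {line.strip() for line in later_file_lines}
    kept = sum(1 for line in suggested if line in later)
    return kept / len(suggested)


# Hypothetical example: two suggested lines, one of which was later rewritten.
score = survivability(["a = 1", "b = 2", ""], ["a = 1", "c = 3"])
```

Under this definition a terse model that writes little and a verbose model that writes a lot can score the same, which is roughly the objection raised in the replies.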

Justin Thyme 🇺🇸🐿️@looking5452·12h

@sdmat123 @kdaigle @github who tf is committing haiku code ffs

sdmat@sdmat123·19h

@kdaigle @github This is like studying percentage of food sent back to the kitchen vs. knife brand

Justin Thyme 🇺🇸🐿️@looking5452·13h

@ProofOfCash @iMilnb I should be clear: these scores are WITH thinking; without thinking, fp16 = ~50-55% and q8_0 = nearly 0%; totally borked

ProofOfCash ⚔️ URSF@ProofOfCash·13h

@looking5452 @iMilnb Thank fuck. I was beginning to think I was just a schizo

iMil 🇪🇸🦇@iMilnb·1d

Epic Reddit post on running Qwen3.5 35B q4 on a 16GB RTX 5080 at 75 tp/s! reddit.com/r/LocalLLaMA/s…

Justin Thyme 🇺🇸🐿️@looking5452·13h

@ProofOfCash @iMilnb I did long context evals in all major quants and kv quants for this model and q8_0 definitely is NOT a free lunch; BEST result at q8_0 was 57% recall; f16 on most quants was easily >97%; major fail at q8_0
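
For a sense of why a q8_0 KV cache is tempting despite the recall gap reported above, here is a rough size estimate. The 34-bytes-per-32-values figure is llama.cpp's q8_0 block layout; the layer count, KV head count, head size, and context length are hypothetical placeholders, not this model's real dimensions:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem):
    """Approximate KV cache size: K and V each hold
    n_kv_heads * head_dim values per token per attention layer."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem


F16_BYTES = 2.0        # one 16-bit float per value
Q8_0_BYTES = 34 / 32   # 32 int8 values plus one f16 scale per block

# Hypothetical dimensions, for illustration only.
dims = dict(n_layers=48, n_kv_heads=8, head_dim=128, ctx_len=131072)

f16_gib = kv_cache_bytes(**dims, bytes_per_elem=F16_BYTES) / 2**30
q8_gib = kv_cache_bytes(**dims, bytes_per_elem=Q8_0_BYTES) / 2**30
print(f"f16: {f16_gib:.2f} GiB, q8_0: {q8_gib:.2f} GiB")
```

With these placeholder numbers q8_0 cuts the cache to a little over half its f16 size, which is the saving being weighed against the long-context recall loss.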

ProofOfCash ⚔️ URSF@ProofOfCash·17h

@iMilnb It's very handwavey when it comes to the KV cache quantization. Of course short sequences won't show degradation at q8_0. I don't think that it's a free lunch if you use the model for agentic coding.

Justin Thyme 🇺🇸🐿️@looking5452·14h

@DOJNatSec

GIF

National Security Division, U.S. Dept of Justice@DOJNatSec·18h

Three Charged with Conspiring to Unlawfully Divert Cutting Edge U.S. Artificial Intelligence Technology to China “The indictment unsealed today details alleged efforts to evade U.S. export laws through false documents, staged dummy servers to mislead inspectors, and convoluted transshipment schemes, in order to obfuscate the true destination of restricted AI technology—China,” said John A. Eisenberg, Assistant Attorney General for National Security. “These chips are the product of American ingenuity, and NSD will continue to enforce our export-control laws to protect that advantage.” 🔗: justice.gov/opa/pr/three-c…

Justin Thyme 🇺🇸🐿️@looking5452·16h

@EricRichards22 copilot cli is much more coherent; the application itself is trash but it... does what it's supposed to (at least for rust, bash, and python)

JimBobSquarePants 🇺🇦@James_M_South

Eric Richards@EricRichards22·1d

Copilot does this kind of thing extremely often, just fails to run tools, or runs them and then can't parse the output, and gets itself in a doom-loop of retrying with different parameters

Worse than useless actually. This is aggressively incompetent.

Justin Thyme 🇺🇸🐿️@looking5452·16h

@rah_66_comanche

GIF

RAH-66 COMANCHE@rah_66_comanche·16h

@looking5452 yes but I haven't watched the video yet ;)

Synth Potato🥔@SynthPotato

RAH-66 COMANCHE@rah_66_comanche·2d

My entire life I have been perennially astounded that most people just react to individual words alone, they don't actually think about what is being said at all.

NVIDIA CEO Jensen Huang confirms DLSS 5 is not post processing or an upscaler, but generative AI being utilized at the geometry level. A literal AI slop filter on top of games, the very thing the yes-men have been denying since it got announced. This is inherently anti-art.

Justin Thyme 🇺🇸🐿️@looking5452·16h

@rah_66_comanche the linked video is the source and it's presented pretty well and with careful delivery of sources... seems likely legit

RAH-66 COMANCHE@rah_66_comanche·16h

@looking5452 Or the person making that thread. Equally likely.

Osvaldo Pinali Doederlein@opinali

Justin Thyme 🇺🇸🐿️@looking5452·16h

@rah_66_comanche welp I suppose I shoulda known better... x.com/opinali/status…

Daniel got important clarifications from NVIDIA. TLDR: the "DLSS5 skeptical" were right about *everything*. 1⃣ It's a 2D AI Filter. Input is only color buffer & motion vectors. The model doesn't see geometry, lights, PBR properties, normals, anything🧵 youtu.be/D0EM1vKt36s

RAH-66 COMANCHE@rah_66_comanche·1d

@looking5452 It would look REAL fucking weird if it didn't affect shaders.

Justin Thyme 🇺🇸🐿️@looking5452

Justin Thyme 🇺🇸🐿️@looking5452·19h

@LottoLabs details/settings for my fellow 16gb lowbies: x.com/looking5452/st…

unsloth Qwen3.5-122B-A10B IQ2_M works EXTREMELY well on modest modern hardware at about 19-21 tps (benches at 23 tps gen, 224 prompt), VERY strong long context, all other standard benches show minimal (<3-5%, many <1%) degradation from baseline; extremely usable, smarter than 35B (50% generation speed but approx == wall clock time to solution due to fewer thinking tokens)
rtx 5060 ti 16gb, ryzen 7 9700x (at 85W limit), 64gb ddr5 at 6000mt/s
--threads 11 \
--threads-batch 13 \
--gpu-layers 99 \
--n-cpu-moe 45 \
--ctx-size 262144 \
--predict 32768 \
--batch-size 512 \
--ubatch-size 512 \
--parallel 1 \
--kv-offload \
--cache-type-k f16 \
--cache-type-v f16 \
--flash-attn on \
--fit on
something magical about this quant; keep token probability tight
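
The wall-clock claim is simple arithmetic: time to solution is tokens generated divided by generation speed, so halving the speed costs nothing if the model also needs about half the tokens. A sketch with made-up numbers (all token counts and speeds below are hypothetical, chosen only to show the break-even):

```python
def seconds_to_solution(total_tokens, tokens_per_second):
    """Generation time only; ignores prompt processing and tool latency."""
    return total_tokens / tokens_per_second


# Hypothetical: the smaller model is twice as fast but thinks twice as long.
small_fast = seconds_to_solution(total_tokens=8000, tokens_per_second=40.0)
big_slow = seconds_to_solution(total_tokens=4000, tokens_per_second=20.0)
print(small_fast, big_slow)
```

Both come out to the same number of seconds here; whether the larger model really needs about half the thinking tokens is the empirical claim in the post, not something this sketch shows.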

Justin Thyme 🇺🇸🐿️@looking5452·2d

@LottoLabs 122b-a10b is goat for 16gb card havers fwiw

Lotto@LottoLabs·2d

Qwen 3.5 models ranked on 3090 W/ hermes agent.
0.8b: for fun, cpu usage, don’t expect much but it runs on anything
2b: starting to be usable, can do small tool calls (not super reliably), drifts from tasks easily, major steering required
4b: actually usable, follows tool calls reliably, follows skills reliably (major bonus), doesn’t drift from tasks as bad as 2b.
9b: all of 4b but more capable w/ more complex tasks, still needs steering, still not 1-shotting tasks but more intelligent than the smaller models
A3b: fast, more general intelligence, feels like the 9b speed but the reasoning closer to 27b, follows tool calls and complex skills well, minimal drift, just lacks big model coding abilities.
27b: the 3090 goat imo, no drift, tool calls for days, writes and follows skills very well, feels like sonnet 3.6-4 range of knowledge level with less glazing, code is usable and can deal w/ multiple files in projects. General knowledge level just feels higher. Only downside is it is slower than A3b and 9b obviously.

Justin Thyme 🇺🇸🐿️@looking5452·20h

@Mike562389 during use I set around 160k because I don't trust models over 128k, tho I have no measurable reason in this case... just a habit to only trust a little more than half max context

MikeBot@Mike562389·1d

@looking5452 What context limit do you set?

Justin Thyme 🇺🇸🐿️@looking5452·1d

I witnessed a chase when I was a kid; the culprit drove behind some bushes, which my friends and I didn't quite understand (yet) because the police were so far behind; when the cops arrived 20 seconds later we put 2 and 2 together and all pointed at the bushes screaming, which the cop took note of, then reversed, went behind them and flushed him out; very exciting for little me!

Kensetsu@Kensetsu6·1d

I saw a slow speed chase today, spike strips and all. I was just trying to get lunch. Interesting to see. I hope they’re all okay. The cops had to stack up with their guns drawn. A tough day for at least a couple people there.

Justin Thyme 🇺🇸🐿️@looking5452·1d

@ID_AA_Carmack my wife thought I was weird for getting an aux cable for the car until I demonstrated the difference between aac+sbc and flac+aux; now she thinks I'm REALLY weird

@EOlonov @LottoLabs x.com/looking5452/st…

John Carmack@ID_AA_Carmack·2d

When you stream Spotify to Bluetooth speakers or headphones, the audio comes over the network lossily compressed with Vorbis or AAC codecs, is then decoded on your device to 48 kHz raw samples, then the Bluetooth stack lossily re-compresses it with SBC or AAC codecs before sending it over the airwaves to the speakers. I don’t have “golden ears” to pick apart audio quality like I can with, say, missing gamma correction on texture filtering, but that still hurts my system optimization soul. It is likely over-optimization, but it would be cleaner if there were a way to send bluetooth-ready, compressed audio directly.

Justin Thyme 🇺🇸🐿️@looking5452·1d

0xE  @enol.app@EOlonov·1d

@looking5452 @LottoLabs How do you run it. I got 70tps for 35b (80k context) one and less than 10 for 122b one.

Justin Thyme 🇺🇸🐿️@looking5452·1d

@Duderichy that is quite the x axis

BuccoCapital Bloke@buccocapital

the Rich@Duderichy·1d

what

Wow. Anthropic is eating OpenAI’s lunch in the enterprise

Justin Thyme 🇺🇸🐿️@looking5452·1d

@rah_66_comanche I think there's a good argument to be made about the guy's uncanny valley comment, but probably not in the way he meant it; one takeaway from the demo is that the highly photorealistic appearance contrasts VERY hard against unrealistic movement, esp in the keyframe animated games

RAH-66 COMANCHE@rah_66_comanche·2d

Yeah you know what it absolutely was not fucking delivering? Real lighting. That's why the last of us looks like everything was shot in a light box lit by 1 LED, and the death stranded chick looks like her skin is made of soft touch lightbulbs.

Nyx@justnyxs

We don’t need DLSS filter slop. Gaming was already delivering real emotion, real performances, and real immersion long before AI upscaling became a crutch... and it’ll keep doing that for decades to come.

Justin Thyme 🇺🇸🐿️@looking5452·1d

@rah_66_comanche nothing that I said conflicts with what he said; I agree, it's not post-processing; it's rendering with the richness of a deep knowledge of what's present underneath; this is a spin-off of their AI textures; same concept, bigger application

RAH-66 COMANCHE@rah_66_comanche·1d

@looking5452 "It’s not post-processing, it’s not post-processing at the frame level, it’s generative control at the geometry level," he said.

Justin Thyme 🇺🇸🐿️@looking5452·1d

for one as big as this, it definitely is solid; obviously a dense model would be less affected, but out of curiosity I just took all the lower quants that I could get to run at all and ran them through several evals, and for some reason this specific quant was like magic; 3-bits were all garbage, same with the other 2-bit quants; there's something special about this specific quant

Lotto@LottoLabs·1d

@looking5452 Cool I should know better than to shit on smaller quants than Q4, all that matters is if it works
