Kaden
117 posts

@schuttdev
building things with Hermes Agent & Claude | CS @ ASU
Tempe, AZ | Joined January 2025
35 Following | 50 Followers

@no_stp_on_snek That makes sense; it feels heavy compared to Qwen3.5 9B

@LottoLabs hah I was just looking at the price of these... I'm in my low cost/perf era

Converted Gemma 4 12B it to ROCmFP4 format and used the MTP Assistant, and I am hitting high 30s to high 40s tok/s decode speed with the full context window, on a Strix Halo Max 395+ with 128 GB RAM. Looks like the Strix Halo Max 395+ is beating the 4-bit quants people are posting on the Spark.
As @barackomaba would say "Chadrock"

Open heart RTX 3090 surgery on @ivanfioravanti's Zotac card.
The card was very old and was easily hitting 90 C under load. Original pads were baked, and paste turned to dust.
We're switching the thermal interface and will send him full pre and post benchmarks after the operation.
For this we're using @Thermal_Grizzly phase-change pads on the GPU core, non-conductive and rated to hold forever. Fresh pads on the memory chips.
Doing this work on every single @luceboxai machine we produce.

@LottoLabs Haha thanks, it’s been fun posting to your site! Appreciate the shoutout, looking into doing some of your evals soon too

pretty cool that the highest TPS on qwen 27b is this run
localmaxxing.com/runs/cmp8fw36n…

Part 2/30 of the LLM Series: RoPE (Rotary Position Embedding)
How does a transformer know the difference between -
"the dog bit the man" and "the man bit the dog"?
The words are identical and only their order differs, but the meaning changes completely.
RoPE encodes position as rotation, allowing transformers to understand relative order through geometry.
Read more:
tensortonic.com/llm-internals
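
To make the "position as rotation" idea concrete, here is a minimal sketch in plain Python. It is not taken from the linked series: the function names `rope` and `dot`, the toy 4-dimensional vectors, and the `base=10000.0` default are assumptions chosen for illustration. Each consecutive pair of dimensions is rotated by an angle proportional to the token position, so the dot product between a rotated query and a rotated key depends only on how far apart the two positions are.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each pair (vec[i], vec[i+1]) by the angle pos * base**(-i/d)."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector dimension must be even")
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # lower frequency for later pairs
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])  # 2D rotation
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical toy query and key vectors.
q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.4]

# Same relative offset (+3) at two different absolute positions.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 105), rope(k, 102))

# Reversed order (offset -3).
score_c = dot(rope(q, 2), rope(k, 5))

print(abs(score_a - score_b) < 1e-9)  # shifting both positions leaves the score unchanged
print(abs(score_a - score_c) > 1e-6)  # swapping the order changes it
```

In this toy setup the attention score is unchanged when both tokens shift by the same amount and changes when their order is swapped, which is the property that lets a model tell "dog bit man" from "man bit dog".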

@PatrickToulme “Hardware agnostic” stacks sacrifice efficiency and performance for portable mediocrity.

"Hardware-agnostic" AI stacks run everywhere and run fast nowhere. The race isn't to build the most portable stack — it's to build the deepest one.
Patrick C Toulme @PatrickToulme

The 7900 XTX is a great card and RDNA3 is well supported. I was just gaming on it when one day I decided to give Ollama a whirl, then ComfyUI, and was like... maybe I'll buy another. $740 on eBay - heck ya, sold!
I knew what I was giving up w/ the 9700's, but yeah, I feel like the XTX is just an underappreciated good value.

@Cryptol33t_NFT @1337hero The 7900 XT has 20 GB VRAM and less compute than the 7900 XTX, but on my card I get ~45 tok/s decode on Qwen 3.5/6 27b

@1337hero @schuttdev can the 7900 XT run 27B or 31B at reasonable tok/s?














