ashpool
@4shpool
11.4K posts
capitulating
127.0.0.1 · Joined February 2021
4.8K Following · 6.6K Followers

Pinned Tweet
ashpool @4shpool
Something the turtleneck boys won't appreciate:
- That's 3x tok/sec vs the guy with a 512 GB Mac Studio, BF16 and 16k context (armchair LLM dabbler, asks promethean intelligence "why is the sky blue?")
- We're at half the price, opencode session >100k context, never compacting (chad, generational slop creator)

You can just do things:
1. use a lower quant
2. buy a GPU
[image]
ashpool @4shpool

Did some half-baked experiments with GPU power limits to see how they affected inference performance on Minimax M2.5. TL;DR:
- the unconstrained 350 W/GPU limit on 6x RTX 3090 gave the best performance and, perhaps counterintuitively, was the most efficient
- Minimax doesn't use all the power I give it. I attribute that to MoE requiring fewer operations per token, but idk
- Nerfing your system in the name of reducing the power bill might not actually help you
Blog: llmgarage.ai/power-limit-to…

2 replies · 0 reposts · 5 likes · 2.1K views
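The "unconstrained was most efficient" claim above is really a tokens-per-joule comparison. Here is a minimal sketch of that arithmetic; the run numbers below are illustrative placeholders I made up, not the blog's actual measurements:

```python
# Hypothetical helper for comparing power-limit settings by energy per token.
# The run data is invented for illustration only.

def tokens_per_joule(tok_per_sec: float, avg_draw_watts: float) -> float:
    """Generation efficiency: tokens produced per joule consumed.
    1 W = 1 J/s, so (tok/s) / (J/s) = tok/J."""
    return tok_per_sec / avg_draw_watts

# (power limit per GPU in W, measured tok/s, measured total system draw in W)
runs = [
    (350, 40.0, 1500.0),  # unconstrained
    (275, 33.0, 1350.0),
    (200, 22.0, 1100.0),
]

for limit, tps, draw in runs:
    print(f"{limit} W limit: {tokens_per_joule(tps, draw):.4f} tok/J")
```

With numbers shaped like these, the higher limit wins on both speed and tok/J, because the throughput loss from capping power outpaces the power saved.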
Sudo su @sudoingX
how much VRAM do you have right now
72 replies · 2 reposts · 42 likes · 3.5K views
ashpool @4shpool
Here I go again Listening to Sean Paul's international affair While training my digital doppelganger / inevitable usurper
[GIF]
0 replies · 0 reposts · 0 likes · 89 views
ashpool @4shpool
@bnjmn_marie You are the only person on this planet that understands KV cache math. Thanks for sharing 🙏
0 replies · 0 reposts · 1 like · 22 views
Benjamin Marie @bnjmn_marie
Nemotron Cascade 2 has the same KV cache as Nemotron 3 Nano:
- attention layers: 6
- kv heads: 2
- head dimension: 128
So the KV cache grows by only 6,144 bytes per token per concurrent sequence. At 32k context, batch/concurrency = 1, that is about 0.20 GB...
3 replies · 2 reposts · 33 likes · 2.1K views
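The KV cache math in that tweet checks out in a few lines. The 6,144-byte figure implies 2-byte (BF16/FP16) cache entries, which is the assumption below:

```python
# KV cache growth for a model with 6 attention layers, 2 KV heads,
# and head dimension 128, assuming 2-byte (BF16) cache entries.
layers, kv_heads, head_dim = 6, 2, 128
bytes_per_entry = 2   # BF16/FP16
kv_tensors = 2        # one K and one V entry per head per layer

bytes_per_token = layers * kv_heads * head_dim * kv_tensors * bytes_per_entry
print(bytes_per_token)        # 6144 bytes per token per sequence

context = 32 * 1024
total_gb = bytes_per_token * context / 1e9
print(f"{total_gb:.2f} GB")   # ≈ 0.20 GB at 32k context, concurrency 1
```

Note the growth is linear in both context length and batch/concurrency, so 8 concurrent 32k sequences would still only need about 1.6 GB.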
am.will @LLMJunky
Finally proud to announce that I've joined the GPU Minor Leagues. 2 x RTX 6000 Pro. I have six months to pay off the second GPU lol. You are all TERRIBLE influences.
[image]
111 replies · 13 reposts · 797 likes · 40.8K views
ashpool @4shpool
I must say Qwen3.5-35-a3b (I use unsloth IQ4_NL) absolutely cooks on a MacBook M4. Best model in the 30B class that I've used (opinion). Prefer MoE to dense for speed, ofc
2 replies · 0 reposts · 5 likes · 378 views
0xSero @0xSero
We did it boys, Kimi running on 8x 3090s and 256 GB of DDR4. ~7 tokens/s for generation. This is pre-reap-enabled salience optimizations. This should get up to a stable ~25 tokens/s for generation and ~200 prefill, I'm hoping, at least
[image]
15 replies · 7 reposts · 152 likes · 24.1K views
ashpool @4shpool
@SpaceMatthieu @levelsio Ctrl+b + s --> pick the session. Also set up a 2x2 grid to increase the autism. The shortcuts are not intuitive to me but you just gotta memorize them
0 replies · 0 reposts · 4 likes · 391 views
Matthieu Richard @SpaceMatthieu
@levelsio Is there a good way to jump between tmux sessions on Termius? I find it quite hard to manage multiple codex/claude sessions on the go
36 replies · 1 repost · 14 likes · 223.2K views
@levelsio @levelsio
Are you guys aware I am coding mostly on my phone now all day via Termius to Claude Code on my server while I go with gf to the dentist, clothing store, cafe, etc. 😛✌️
[image]
rootkid ✌️ @rootkid

@levelsio "You" ➡️ the IP your Internet provider assigns you, not your servers' IPs. If you had a static IP, I'd like to know why you prefer Tailscale over just adding e.g. your company IP to the firewall's SSH whitelist.
320 replies · 88 reposts · 2.1K likes · 680.2K views
ashpool @4shpool
@gumsays You take 20-30k steps per day in Italy
0 replies · 0 reposts · 2 likes · 197 views
gum @gumsays
man how can I eat tons of bread butter and cheese in italy and not get bloated or fat but at home I become obese
19 replies · 4 reposts · 45 likes · 7.2K views
ashpool @4shpool
@web4O The year is 2030 And Larry Ellison's net worth is eclipsed by an SF elevator mechanic who just goes by "Todd"
0 replies · 0 reposts · 1 like · 45 views
ashpool @4shpool
@walls_jason1 Teach a man to fish, he'll eat for a day Teach a man the NEC + UL 508A; he'll eat for a lifetime
0 replies · 0 reposts · 0 likes · 73 views
Jason Walls @walls_jason1
Yesterday Mark Cuban reposted my work, DM'd me, and told me to keep telling my story. So here it is.

I'm a Master Electrician. IBEW Local 369. 15 years pulling wire in Kentucky. Zero coding background. I didn't go to Stanford. I went to trade school.

Every week I'd show up to a home where someone just bought a Tesla or a Rivian. And every time, someone had already told them they needed a $3,000-$5,000 panel upgrade to install a charger. 70% of the time? They didn't need it.

The math is in the NEC — Section 220.82. Load calculations. But nobody was doing them for homeowners. Electricians upsell. Dealers don't know. And the homeowner just pays.

I got angry enough to build something about it. I found @claudeai. No coding experience. I just started talking to it like I'd explain a job to an apprentice. "Here's how load calcs work. Here's the NEC code. Now help me build a tool that does this."

6 months later — @ChargeRight is live. Real software. Stripe payments. PDF reports. NEC 220.82 calculations automated. $12.99 instead of a $500 truck roll.

I'm still pulling wire. I still take service calls. I wake up at 5:05 AM for work. But something shifted.

Yesterday @vivilinsv published my story as Claude Builder Spotlight #1. Mark Cuban saw it. The Claude community showed up. And for the first time, I felt like this thing I built in my kitchen might actually matter.

I'm not a tech founder. I'm a dad who wants to coach little league and be home for dinner. I just happened to build something that helps people.

If you're in the trades and thinking about using AI — do it. The barrier isn't technical skill. It's believing you're allowed to try.

EVchargeright.com
604 replies · 2.2K reposts · 16.3K likes · 880.2K views
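The NEC 220.82 "optional method" calculation the tweet describes is simple enough to sketch. This is a heavily simplified illustration, not ChargeRight's actual code: the function names and example numbers are mine, the 100%-of-first-10-kVA / 40%-of-remainder demand factors come from 220.82(B)-(C) as I understand them, and a real calc covers more load categories and belongs with a licensed electrician or the actual code text:

```python
# Simplified sketch of an NEC 220.82 optional-method dwelling load calc.
# Illustrative only — omits many categories and edge cases in the code.

def general_load_va(sq_ft: float, appliance_va: float) -> float:
    """General loads: 3 VA/sq ft for lighting, two small-appliance circuits
    and one laundry circuit at 1500 VA each, plus fixed-appliance nameplates."""
    return 3 * sq_ft + 3 * 1500 + appliance_va

def demand_va(general_va: float, hvac_va: float) -> float:
    """Demand: 100% of the first 10 kVA of general load, 40% of the
    remainder, plus the heating/cooling load (taken here at 100%)."""
    first = min(general_va, 10_000)
    rest = max(general_va - 10_000, 0) * 0.40
    return first + rest + hvac_va

def service_amps(total_va: float, volts: float = 240.0) -> float:
    """Convert demand VA to amps on a 120/240 V service."""
    return total_va / volts

# Hypothetical 2000 sq ft home with ~9 kVA of fixed appliances and 5 kVA A/C
general = general_load_va(sq_ft=2000, appliance_va=9000)
total = demand_va(general, hvac_va=5000)
print(f"calculated load: {service_amps(total):.0f} A")   # 78 A
```

If the calculated load lands well under the existing service rating (say 200 A), there may be headroom for an EV charger without the panel upgrade, which is the point of the tweet.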
Layer33 @Layer_33_
Another week, more DMs of validators shutting down. "Well, it's good to trim the fat, right?" Yes, but these validators offer something unique: network diversity. Solana is in crisis. We're here to fight back. Support Solana, stake with Layer33 👉 Layer33.com
[image]
7 replies · 4 reposts · 19 likes · 938 views
ashpool @4shpool
I'm looking for $20k to buy two Nvidia 6000 pro I'm offering 0.001% of my company For building the machinery of technocratic authoritarianism
[GIF]
1 reply · 0 reposts · 6 likes · 270 views
ashpool @4shpool
You guys really should read Player Piano. It's an easy read. Gif unrelated
[GIF]
0 replies · 0 reposts · 2 likes · 191 views
John C @MeJohnC
Got Qwen3.5-122B-A10B-UD-Q4_K_XL running on 4x AMD R9700 (128GB VRAM total) via patched vLLM/ROCm:
- 262K context working
- 262,000-token prefill: 437 tok/s
- decode: 23.7 tok/s
- roughly 4x 3090 / 4x 4090 class on decode
- full system draw was just over 1000 W, R9700s capped at 180 W each
stevibe @stevibe

Finally got my hands on the big one. Qwen3.5-122B-A10B — 122 billion parameters. Too big for any single consumer GPU. So I rented 4 of each... and then one professional card to see if brute force even matters.
- 1x RTX PRO 6000 (96GB): 101.4 tok/s
- 4x 5090 (128GB): 87.0 tok/s
- 4x 4090 (96GB): 25.1 tok/s
- 4x 3090 (96GB): 20.8 tok/s
One single $8,500 card beat four RTX 5090s
3 replies · 3 reposts · 22 likes · 6.4K views
buffalu @buffalu__
@mert its pretty lit when it works. whats better?
15 replies · 0 reposts · 14 likes · 3.8K views
buffalu @buffalu__
openclaw is so buggy these days on coding. stops working mid-stream, background agents aren't working super well. anyone having better luck than me?
57 replies · 1 repost · 65 likes · 25K views
ashpool @4shpool
Qwen3.5-122B writes a blog comparing qwen3.5-122B Q6_K_XL to qwen3.5-397B UD-TQ1_0 using Hermes Agent (local). Key takeaways:
- 122B Q6_K_XL crammed onto 6x RTX 3090
- 262k context at Q8
- 46-34 tok/sec across the full context range
- excellent small-midsize MoE model, can recommend 👍
[images]
3 replies · 1 repost · 12 likes · 5.5K views
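A quick back-of-envelope check shows why a 122B model at Q6_K_XL fits on six 24 GB cards. The ~6.5 bits/weight average for Q6_K-family quants is an assumption (actual GGUF sizes vary with the tensor mix), and the headroom figure ignores CUDA/runtime overhead:

```python
# Rough VRAM budget for 122B parameters at a Q6_K-class quant on 6x RTX 3090.
params = 122e9
bits_per_weight = 6.5            # approximate Q6_K average (assumption)
weights_gb = params * bits_per_weight / 8 / 1e9
vram_gb = 6 * 24                 # six 24 GB RTX 3090s

print(f"weights: ~{weights_gb:.0f} GB of {vram_gb} GB")
print(f"headroom for KV cache + activations: ~{vram_gb - weights_gb:.0f} GB")
```

That leaves on the order of 45 GB for the Q8 KV cache and activations, which is how a 262k context becomes feasible on this rig.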
Ahmad @TheAhmadOsman
@stevibe bro, using ollama with these GPUs is killing the performance you're leaving so much on the table
4 replies · 0 reposts · 130 likes · 10.2K views
stevibe @stevibe
Finally got my hands on the big one. Qwen3.5-122B-A10B — 122 billion parameters. Too big for any single consumer GPU. So I rented 4 of each... and then one professional card to see if brute force even matters.
- 1x RTX PRO 6000 (96GB): 101.4 tok/s
- 4x 5090 (128GB): 87.0 tok/s
- 4x 4090 (96GB): 25.1 tok/s
- 4x 3090 (96GB): 20.8 tok/s
One single $8,500 card beat four RTX 5090s
58 replies · 84 reposts · 694 likes · 186.8K views
ashpool @4shpool
You need to be hermestermuxfromthecouchminimaxxing already
[image]
0 replies · 1 repost · 4 likes · 309 views