D Par

13 posts

@tersestwatts

Joined March 2026
183 Following · 11 Followers
D Par@tersestwatts·
@nft_goe I tried them once before I really knew anything about crypto but I had my proton wallet. The fees are crazy. I’ve moved my purchasing to @ndaxio. Not an ad, just the truth.
1
0
4
1.6K
D Par@tersestwatts·
If I had real followers, they might actually laugh at the fact that today I got a report from the Apple Passwords app that one of my passwords had been compromised and found in a data leak. It's the 3-digit code to my luggage. I stored it so I wouldn't get stuck while traveling.
0
0
0
32
D Par@tersestwatts·
@sudoingX Alright, I’ll bite: what would you use for an AMD RX9600XT with 16GB of VRAM? I’m running Ollama with ROCm on Ubuntu 24.04, all the latest AMD drivers installed.
0
0
0
341
Sudo su@sudoingX·
so yesterday i dropped the bench numbers and what fits. today is the actual agent running on this 10 year old gpu card.

qwen3 8b q4_k_m on a gtx 1080 8gb. hermes agent loaded with full tool set, browser controls live, nvtop pinned at 100% gpu, 7.5gb of 8gb vram occupied.

the unsloth weights pulled directly from huggingface, q4 quant, llama.cpp built for sm_61 (the pascal compute capability that everyone forgot exists). 31 tok/s gen speed, faster than most people read.

this is what happens after the bench. raw perf was the receipt for what fits. now we test what actually works. agent loops, tool calls, real coding tasks coming next.

ten year old card, $150 used, running a current open weight model with a current agent. nothing exotic. just the right quant, the right kv cache trick, the right engine compiled for the right arch.

tell me what gpu you have, i'll tell you what runs.
Sudo su@sudoingX

gtx 1080 8gb of vram launched may 2016. card turns ten this month. just ran three current open weight agentic models on one and the smallest of them fit 656,000 tokens of context at 38 tok/s gen speed. on a pascal arch card with no tensor cores. on 8gb of gddr5x that the discourse keeps telling me is unusable.

three models, same hardware, same locked flags. qwen3 8b, qwen 3.5 9b, gemma 4 e4b. q4_k_m quant across the board. q4_0 kv cache, flash attention on, llama.cpp built for sm_61. one line setup.

results

vram ceiling on 8gb
qwen3 8b, 78k
qwen 3.5 9b, 248k
gemma 4 e4b, 656k

gen tok/s at small context
qwen3 8b, 31.71
qwen 3.5 9b, 29.91
gemma 4 e4b, 42.13

gen tok/s at the ceiling
qwen3 8b, 31.78 at 77k
qwen 3.5 9b, 29.62 at 248k
gemma 4 e4b, 38.73 at 648k

agent workload combined throughput at 16k input
qwen3 8b, 285.98
qwen 3.5 9b, 413.18
gemma 4 e4b, 543.63

gemma sweeps every category. 2.6x more context than qwen 3.5 9b, 8.4x more than qwen3 8b, 30% faster at the ceiling. sliding window attention keeps the kv cache nearly flat as context grows, which is why 8gb stretches an order of magnitude further on gemma than on a vanilla transformer.

the part that gets me is qwen3 8b losing to qwen 3.5 9b at anything past 4k context. newer release, but heavier kv per token, less aggressive gqa, every release has tradeoffs and pascal exposes them by giving the architecture nowhere to hide.

q4_0 kv cache is the practical unlock. flash attention on pascal still works in 2026, no special path needed. sm_61 compiles clean in llama.cpp. that's the entire stack. a card you literally might have in a drawer can run a coding agent with 600k+ tokens of context.

raw perf is one axis. next drop is the other one. agentic coding on the same hardware. single file canvas demos, then multi file refactors. can these models finish a task without rails or do they fall apart the moment the agent loop gets deep.

stay tuned. you might have this card in a drawer.
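
The ratios quoted in the post can be rechecked from its own figures, and the "heavier kv per token" remark can be made concrete with the usual per-token KV cache size for a full-attention transformer with grouped-query attention. What follows is a minimal sketch, not the poster's method. The dictionary layout and the helper names `ratio` and `kv_bytes_per_token` are illustrative, the model shape at the bottom is hypothetical and not taken from any of the three models, and sliding-window layers are deliberately not modelled. The 4.5 bits per value for `q4_0` follows from its block layout (32 four-bit values plus one 16-bit scale).

```python
# Figures copied from the post; the dict layout and helper names are illustrative.
RESULTS = {
    "qwen3 8b": {"ctx_ceiling_k": 78, "tok_s_at_ceiling": 31.78},
    "qwen 3.5 9b": {"ctx_ceiling_k": 248, "tok_s_at_ceiling": 29.62},
    "gemma 4 e4b": {"ctx_ceiling_k": 656, "tok_s_at_ceiling": 38.73},
}


def ratio(metric: str, model: str, baseline: str) -> float:
    """How many times larger `metric` is for `model` than for `baseline`."""
    return RESULTS[model][metric] / RESULTS[baseline][metric]


def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                       bits_per_value: float) -> float:
    """KV cache bytes per token for a full-attention transformer with GQA.

    Keys and values are both cached for every layer and KV head, hence the 2.
    Sliding-window attention is not modelled here.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bits_per_value / 8


if __name__ == "__main__":
    gemma = "gemma 4 e4b"
    for other in ("qwen 3.5 9b", "qwen3 8b"):
        ctx = ratio("ctx_ceiling_k", gemma, other)
        speed = ratio("tok_s_at_ceiling", gemma, other)
        print(f"gemma vs {other}: {ctx:.1f}x context, {speed:.2f}x tok/s at ceiling")

    # Hypothetical model shape, chosen only to show the effect of the cache type.
    shape = dict(n_layers=32, n_kv_heads=8, head_dim=128)
    f16 = kv_bytes_per_token(**shape, bits_per_value=16)
    q4_0 = kv_bytes_per_token(**shape, bits_per_value=4.5)
    print(f"f16: {f16:.0f} B/token, q4_0: {q4_0:.0f} B/token")
```

On these numbers the context ratios round to the 2.6x and 8.4x stated in the post, and the "30% faster at the ceiling" figure matches the comparison against qwen 3.5 9b rather than qwen3 8b. Fewer KV heads or fewer bits per value shrink the per-token cost linearly, which is why the cache type and the degree of GQA both move the context ceiling on a fixed 8gb budget.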

62
21
233
50.9K
D Par@tersestwatts·
@leoofz @Wealthsimple Any thoughts on setting soft category limits that show if you go over your budget for that category?
1
0
4
207
Leo 🌞@leoofz·
Today at Wealthsimple Presents we announced something I’ve been working on for the past few months: ✨ Spend Insights ✨ I had the pleasure of bringing this to life with an absolute dream team… so excited to finally see it shipped 🚀 Check the 🧵 for more👇
31
4
208
20K
D Par@tersestwatts·
@VIA_Rail why do you hate me? I booked Business Plus from Toronto to Ottawa and you switched me from window to aisle. I moved myself to another window seat and you moved me back to an aisle. Could you quit before I find another way to travel? Why pay extra for service like this?
0
0
0
14
TheCanadianGoose@Eh_Canada_Goose·
@tersestwatts @CTVNews No. It is a defect. It being software is irrelevant. A safety risk was noted and needed to be addressed. It’s a recall.
1
0
0
31
D Par@tersestwatts·
“Service to replace the battery could take one to two weeks, Apple said in a statement. The program is for battery replacement only.” From the article, it’s a recall to replace a piece of hardware. It’s not a software update. On the Tesla, by contrast, no hardware is being replaced; it is a software update, so calling it a recall makes for a sensational story.
1
0
1
55
D Par@tersestwatts·
@Eh_Canada_Goose @CTVNews Because they can't software patch that particular safety risk. If they could software patch it no one would call it a recall.
1
0
0
50
D Par@tersestwatts·
@GooseChees @HoneyBadgerRon honey badger vs. cobra chicken, this would be the ultimate no-back-down fight. Neither animal has any fear.
1
0
2
25
Goose@GooseChees·
What bothers you the most, the fact I’m a Goose or the fact that I’m a Goose?
7
2
45
587
D Par@tersestwatts·
@asklumo any thoughts on OAuth integration with something like Hermes agent from @NousResearch? I see I can do this with SuperGrok now; it would be great to have the option with Lumo+ as well.
1
0
0
31
dan@Dan16676935420·
@AutismCapital Who would want a 20 year old computer today?
3
0
6
378
D Par@tersestwatts·
@tesla you need to do something with your nav. It finds the shortest route on the map but doesn’t consider where the lights are in my neighborhood. It wants to make left-hand turns at an intersection without lights that is so bad I can easily get stuck there for 20 to 30 minutes, when it could go to an intersection with lights that would make the turn reasonable. Very frustrating. I keep having to take over from FSD to save it from a terrible routing decision.
0
0
0
14