TERMINATOR X
6.9K posts

@scanamatics
Robot Supremacist


UFC structure going up on the White House South Lawn for June 14 fight



Cool, Ted. No one asked you, bro. Stop trying to undermine the President and his administration.







gtx 1080 8gb of vram launched may 2016. card turns ten this month. just ran three current open weight agentic models on one and the smallest of them fit 656,000 tokens of context at 38 tok/s gen speed. on a pascal arch card with no tensor cores. on 8gb of gddr5x that the discourse keeps telling me is unusable.

three models, same hardware, same locked flags. qwen3 8b, qwen 3.5 9b, gemma 4 e4b. q4_k_m quant across the board. q4_0 kv cache, flash attention on, llama.cpp built for sm_61. one line setup.

results

vram ceiling on 8gb
qwen3 8b, 78k
qwen 3.5 9b, 248k
gemma 4 e4b, 656k

gen tok/s at small context
qwen3 8b, 31.71
qwen 3.5 9b, 29.91
gemma 4 e4b, 42.13

gen tok/s at the ceiling
qwen3 8b, 31.78 at 77k
qwen 3.5 9b, 29.62 at 248k
gemma 4 e4b, 38.73 at 648k

agent workload combined throughput at 16k input
qwen3 8b, 285.98
qwen 3.5 9b, 413.18
gemma 4 e4b, 543.63

gemma sweeps every category. 2.6x more context than qwen 3.5 9b, 8.4x more than qwen3 8b, 30% faster at the ceiling. sliding window attention keeps the kv cache nearly flat as context grows, which is why 8gb stretches an order of magnitude further on gemma than on a vanilla transformer.

the part that gets me is qwen3 8b losing to qwen 3.5 9b at anything past 4k context. newer release, but heavier kv per token, less aggressive gqa, every release has tradeoffs and pascal exposes them by giving the architecture nowhere to hide.

q4_0 kv cache is the practical unlock. flash attention on pascal still works in 2026, no special path needed. sm_61 compiles clean in llama.cpp. that's the entire stack. a card you literally might have in a drawer can run a coding agent with 600k+ tokens of context.

raw perf is one axis. next drop is the other one. agentic coding on the same hardware. single file canvas demos, then multi file refactors. can these models finish a task without rails or do they fall apart the moment the agent loop gets deep.

stay tuned. you might have this card in a drawer.
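
a quick sanity check on the ratios in the post, plus a rough sketch of why a sliding window keeps the kv cache nearly flat. the context and tok/s figures are copied from the post. the layer count, kv head count, head dim, window size and number of windowed layers are made-up illustrative values, not the real configs of any of these models. the only format fact assumed is that ggml's q4_0 packs 32 values into 18 bytes.

```python
# figures copied from the post (context ceiling in thousands of tokens,
# gen tok/s at the ceiling)
CTX_K = {"qwen3 8b": 78, "qwen 3.5 9b": 248, "gemma 4 e4b": 656}
TOKS_AT_CEILING = {"qwen3 8b": 31.78, "qwen 3.5 9b": 29.62, "gemma 4 e4b": 38.73}

ctx_vs_qwen35 = CTX_K["gemma 4 e4b"] / CTX_K["qwen 3.5 9b"]
ctx_vs_qwen3 = CTX_K["gemma 4 e4b"] / CTX_K["qwen3 8b"]
speedup_pct = (TOKS_AT_CEILING["gemma 4 e4b"] / TOKS_AT_CEILING["qwen 3.5 9b"] - 1) * 100

# q4_0 block: 32 4-bit values plus one fp16 scale = 18 bytes
Q4_0_BYTES_PER_VALUE = 18 / 32


def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim,
                   bytes_per_value=Q4_0_BYTES_PER_VALUE,
                   window=None, n_windowed_layers=0):
    """Rough kv cache size. Ignores compute buffers and model weights.

    Full-attention layers store K and V for every token in context.
    Sliding-window layers store at most `window` tokens each.
    """
    per_token_per_layer = 2 * n_kv_heads * head_dim * bytes_per_value  # K + V
    full_layers = n_layers - n_windowed_layers
    windowed_tokens = n_ctx if window is None else min(n_ctx, window)
    return (full_layers * n_ctx + n_windowed_layers * windowed_tokens) * per_token_per_layer


if __name__ == "__main__":
    print(f"context vs qwen 3.5 9b: {ctx_vs_qwen35:.1f}x")
    print(f"context vs qwen3 8b:    {ctx_vs_qwen3:.1f}x")
    print(f"gen speed at ceiling:   +{speedup_pct:.0f}%")

    # HYPOTHETICAL shapes, for illustration only
    shape = dict(n_layers=32, n_kv_heads=8, head_dim=128)
    for n_ctx in (16_000, 128_000, 512_000):
        vanilla = kv_cache_bytes(n_ctx, **shape)
        mostly_windowed = kv_cache_bytes(n_ctx, window=1024, n_windowed_layers=28, **shape)
        print(f"{n_ctx:>7} tokens: vanilla {vanilla / 2**30:.2f} GiB, "
              f"mostly windowed {mostly_windowed / 2**30:.2f} GiB")
```

in the vanilla case every layer pays for every token, so the cache grows linearly with context. in the windowed case only the few full-attention layers keep growing, which is the shape of the effect the post describes. real numbers depend on the actual model configs and on llama.cpp's own buffers.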













Massie: I would have come out sooner but I had to call my opponent to concede and it took a while to find him in Tel Aviv




The media is freaking out about us right now. Why? Because we’re on the verge of getting almost all medical mandates BANNED in multiple states right now. Including vaccines. Leslie Manookian received a “deluge of texts, emails, and phone calls from major media” after she announced our Medical Freedom Act Coalition. “It means that we’re right over the target.” She led the successful charge to pass Idaho’s Medical Freedom Act in 2025. Now, we’re bringing that victory nationwide. Arizona’s Medical Freedom Act just passed the state House and Senate. And at least a dozen more states have introduced Medical Freedom Acts. Health freedom is sweeping America, and the Big Pharma-owned media is freaking out. @LeslieManookian @WestonAPrice

Massie’s Race Matters!









