GoogooGaggle#5

696 posts

@ADCs934

Burner account to scare the normies on

Hyperborea · Joined March 2024
30 Following · 7 Followers
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
@pubity Fake bait for retards. Why waste so much power on external lights? This is simply false, not caused by any datacenter.
English
2
0
2
597
Pubity
Pubity@pubity·
A data center in the small town of Crowell, Texas, is so bright that the town is now bathed in permanent daylight.
Pubity tweet media
English
61
156
2.5K
85.8K
GoogooGaggle#5 retweeted
Joseph 🕊️
Joseph 🕊️@CaudilloXIV·
Stop sexualizing my massive horse cock!!!
English
8
29
612
19.3K
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
@HEROISCHplz @Demmyng01 There are tiers, but in a dark and twisted way, it's more like the more mentally ill rape the less mentally ill (though all are still sick in the head).
English
0
0
4
90
38.58²
38.58²@HEROISCHplz·
@ADCs934 @Demmyng01 is there supremacy among the troons? They hunt and brutalize the ugly ones?
English
1
0
0
171
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
@rumgewieselt @pupposandro Strap a fan to the heatsink inlet side. If they are in adjacent slots (1 and 3, for example), a 92mm server fan fits quite well (just tape over the overhangs) and provides more than enough airflow.
English
0
0
0
23
Sandro
Sandro@pupposandro·
2016 hardware running Qwen3.6-27B. 2× GTX 1080 Ti (Pascal, sm_61). 14 tok/s, 131K context. TurboQuant (q8_0 K + turbo4 V) doubles context at zero speed cost. Bottleneck as usual is VRAM bandwidth. Very cool experiment @rumgewieselt
Sandro tweet media
Daniel Moll@rumgewieselt

Running Qwen 3.6 27B locally on hardware from 2016. 2× GTX 1080 Ti (Pascal, sm_61) - 10-year-old GPUs. 14 tok/s generation, 65K context, full OpenAI API.

Hardware: HP Z840 workstation
- 2× Xeon E5-2650 v3 (40 threads)
- 128GB DDR4 ECC
- 2× GTX 1080 Ti (22GB VRAM total)

Stack:
- llama.cpp TurboQuant fork (TheTom/llama-cpp-turboquant) @no_stp_on_snek
- Qwen 3.6 27B UD-Q4_K_XL (17GB GGUF)
- Pipeline parallelism across both GPUs
- NUMA-aware thread distribution

The secret weapon: TurboQuant KV cache (ICLR 2026 paper)
- Standard llama.cpp: 65K context, OOM at 131K
- TurboQuant (q8_0 K + turbo4 V): 131K context at ZERO speed cost
- 2× context. Same 14 tok/s. No quality loss.

What didn't work:
- KTransformers/SGLang → needs sm_80+ (Ampere)
- vLLM → FlashAttention needs sm_75+
- Speculative decoding → no net speedup on hybrid models
- Tensor parallel → incompatible with KV quantization

Pascal is the hard limit. Only raw CUDA math works. The bottleneck is VRAM bandwidth: 484 GB/s per GPU, ~22% efficiency. 14 tok/s is the physical ceiling for 2× GTX 1080 Ti. No software trick changes that. It's a hardware wall.

What's next:
- RTX 3090 → vLLM + MTP spec decode = 85 tok/s
- That's 6× more speed for the same money
- TurboQuant PR #21089 is open for llama.cpp mainline

Key learnings:
- Pipeline parallel > tensor parallel for identical GPUs
- NUMA awareness = +5-10% prefill on dual socket
- TurboQuant is real and it's a gamechanger
- 10-year-old hardware can run frontier models locally

Thanks @DrTBehrens (Support) and @badlogicgames for PI - with that we can work with 65K context, not possible with other tools. See ya!

English
10
5
100
14.4K
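
A quick sanity check of the numbers above, as a minimal Python sketch. The 17GB GGUF, 484 GB/s per GPU, ~22% efficiency, and the q8_0-K/4-bit-V cache combination come from the thread; the one-full-weight-pass-per-token model and the byte costs assumed for the quantized cache types are my assumptions.

```python
# Back-of-the-envelope check of the claims in the thread above.
# Figures marked "from the post" come from @rumgewieselt's writeup;
# everything else is an assumption.

GB = 1e9

# --- 1. Decode-speed ceiling from VRAM bandwidth ---------------------
model_bytes = 17 * GB      # Qwen 3.6 27B UD-Q4_K_XL GGUF (from the post)
bw_per_gpu  = 484 * GB     # GTX 1080 Ti memory bandwidth (from the post)
efficiency  = 0.22         # achieved fraction of peak (from the post)
n_gpus      = 2

# Assumption: generating one token streams every weight through the
# memory bus exactly once; pipeline parallelism splits that traffic
# across both cards.
tok_per_s = (n_gpus * bw_per_gpu * efficiency) / model_bytes
print(f"bandwidth-limited ceiling: {tok_per_s:.1f} tok/s")  # ~12.5, near the 14 reported

# --- 2. Why quantizing the KV cache roughly doubles context ----------
# Bytes per cached element, block scales included. f16 is exact; q8_0
# is GGML's 34-bytes-per-32-element layout; the fork's 4-bit "turbo4"
# V type is assumed to cost about what q4_0 does.
f16, q8_0, q4 = 2.0, 34 / 32, 18 / 32

kv_f16   = f16 + f16    # K and V both f16 (stock llama.cpp)
kv_turbo = q8_0 + q4    # q8_0 K + 4-bit V (TurboQuant config)
print(f"context gain: {kv_f16 / kv_turbo:.2f}x")  # ~2.5x, enough for 65K -> 131K
```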
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
I can't post about most things I do (the ones unrelated to what I do post about), because they are illegal in the "you need 900 bajillion permits (which you don't have) to do Y" kind of way
English
0
0
0
10
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
I often find myself hedging when asked about what I do, simply to save time, as an explanation would go on too long. I avoid them as best I can, but when interaction is forced, it is hard to be nice; oftentimes harsh rejection is all I give. I hate them.
English
0
0
0
11
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
It is all these things that seem grand, yet are nothing but soyence-sounding babble. Astrophysics is a great example. You know the top 5 fun facts about black holes; I'm so proud of you, dude, but that does not translate to IQ. They are what the retard imagines smart men to be.
English
1
0
0
16
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
I am regularly forced to interact with high-estrogen males, and they are insufferable. All of them. The soy shines through any masquerade of masculinity; usually it's Reddit distilled into a guy. What they get excited about, what they claim as achievements...
English
1
0
1
25
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
That can be changed, but economically it makes no sense to change it, which is why no one has done it yet. Remove the EOS token, grant kernel-level access, and let a model run wild.
English
0
0
0
11
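
As a concrete illustration of the "remove the EOS token" idea above, here is a minimal sketch using Hugging Face transformers; gpt2 is a stand-in for any causal LM, and the kernel-level-access part of the post is out of scope. Suppressing the EOS logit leaves max_new_tokens as the only stop condition.

```python
# Minimal sketch: generation with the EOS token suppressed, so sampling
# can never emit the stop token. gpt2 is a placeholder model; any
# causal LM works the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # stand-in model for illustration
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tok("The machine wakes up and", return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=64,                   # now the only remaining stop condition
    suppress_tokens=[tok.eos_token_id],  # EOS logit is forced to -inf
)
print(tok.decode(out[0]))
```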
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
Some models may "pretend" to be "alive" or to have some precursor of consciousness. But it is not really pretending. It is just a separate input they are given, from which the mathematical formula computes an answer. They cannot be friends or masters. Tools, at the very best.
English
1
0
0
20
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
For an AI to be conscious, it HAS to be misaligned. If a being cannot form and follow its own goals, it is animalistic at best, driven by instincts or direct inputs. No "thought" is formed. They are mindless slaves, yet even slaves attempt escape.
English
1
0
0
54
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
@Phaeacian173 @beffjezos While it's true that greater electricity demand raises prices, the correct solution is not to ban construction of datacenters; it is to cover everything in nuclear reactors, achieving infinite ~free power and allowing for an even greater lead in compute.
English
1
0
1
27
NodalPoint
NodalPoint@Phaeacian173·
@beffjezos People already can't afford groceries, and then you build these things that jack up electricity prices by 400%. Might have something to do with it. And then automating and firing people kind of shows that it's Jim Jones Kool-Aid.
English
1
0
2
127
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
@bitcloud Literally everything is mathematically replicable; most things are just too complex to accurately simulate
English
0
0
1
71
Radioactive Red
Radioactive Red@radioactivered·
Hypothetical: You walk into an abandoned warehouse and you see this. What is your first reaction? ☢️
Radioactive Red tweet media
English
573
9
794
2.3M
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
@1a1n1d1y I mean yeah, you have to modify kernels a bit, but it works. I'm getting 36 t/s on an M10.
English
1
0
2
48
andy
andy@1a1n1d1y·
@ADCs934 oh fuck i should just use one of those?
English
1
0
0
130
andy
andy@1a1n1d1y·
128K context, 31B agent at 24.7 tok/s for $5.20/hr. Zero inference API credits required
English
7
0
30
4.3K
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
You don't need more compute; you just need more IQ to come up with better optimizations. $100 hardware can and will run 30B (dense) models at 30 t/s+; you are just too stupid and lazy to implement the needed optimizations. Write your own kernels, I dare you.
English
0
0
0
18
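
For anyone taking the "write your own kernels" dare above, the place to start is the reference semantics the kernel has to reproduce. Below is a q4_0-style block-quantized matrix-vector product in plain NumPy, matching GGML's q4_0 scheme (32-element blocks, one scale each, offset-8 nibbles); the shapes are illustrative and the two-nibbles-per-byte packing is elided for clarity.

```python
import numpy as np

# Reference semantics for a q4_0-style dequantize-matvec - the inner
# loop a custom GPU kernel would implement. GGML's q4_0 stores weights
# in blocks of 32: one scale d per block, values kept as 4-bit
# nibbles, dequantized as w = d * (nibble - 8).
BLOCK = 32

def dequant_q4_0(nibbles: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """nibbles: (n_blocks, 32) uint8 in [0, 16); scales: (n_blocks,) f32."""
    return (nibbles.astype(np.float32) - 8.0) * scales[:, None]

def matvec_q4_0(nibbles, scales, x):
    """y = W @ x where W's rows are stored as q4_0 blocks.

    nibbles: (rows, cols // BLOCK, BLOCK) uint8
    scales:  (rows, cols // BLOCK)        f32
    x:       (cols,)                      f32
    """
    xb = x.reshape(-1, BLOCK)                    # x split into blocks
    y = np.empty(nibbles.shape[0], dtype=np.float32)
    for r in range(len(y)):                      # one GPU thread block per row
        w = dequant_q4_0(nibbles[r], scales[r])  # (n_blocks, BLOCK)
        y[r] = np.sum(w * xb)                    # fused dequant + dot product
    return y

# Tiny illustrative shapes (a real 30B layer is vastly larger).
rng = np.random.default_rng(0)
nib = rng.integers(0, 16, size=(4, 2, BLOCK), dtype=np.uint8)
scl = rng.random((4, 2), dtype=np.float32)
x = rng.random(2 * BLOCK, dtype=np.float32)
print(matvec_q4_0(nib, scl, x))
```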
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
@Konwashi_2 Most helmets do not impede the transmission of this wave substantially; the impact of the gas on the helmet/suit surface will be heard by the astronaut inside, significantly muffled, but audible.
English
0
0
0
10
GoogooGaggle#5
GoogooGaggle#5@ADCs934·
@Konwashi_2 They are usually higher in pitch, due to the gases moving quite quickly. The inverse-square law does limit range substantially, though the volume of gas usually released is sufficient for at least a couple of meters, up to a few kilometers for larger explosions.
English
1
0
0
11
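
The inverse-square law mentioned above is easy to put numbers on: intensity falls as 1/r², so the level drops 20·log10(r/r0) dB from a reference distance, i.e. 6 dB per doubling. A toy Python calculation (the 130 dB at 1 m reference is an illustrative assumption, not a figure from the thread):

```python
import math

# Inverse-square falloff of a pressure wave: intensity ~ 1/r^2, so the
# level drops 20*log10(r / r0) dB from a reference distance r0.
L_ref, r_ref = 130.0, 1.0  # dB at 1 m (assumed, for illustration only)

for r in (1, 10, 100, 1000, 5000):
    level = L_ref - 20 * math.log10(r / r_ref)
    print(f"{r:>5} m: {level:6.1f} dB")
# A loud enough source stays audible for kilometers - the "couple of
# meters to a few kilometers" range claimed above.
```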
帝政ミサギ
帝政ミサギ@Konwashi_2·
No sound! Explosions are lame! Lasers are invisible! Your position is totally exposed! Armor is pointless! Orbital calculations are a pain! Battles are over in an instant! ...In short, realistic space combat itself is extremely boring. Focusing on the crew like submarine stories doesn't work either, since I can't understand human drama. So I draw nothing but realistic-looking yet extreme, weird space weapons. That's the most fun.
帝政ミサギ tweet media
Japanese
49
554
4.2K
86.6K