vveerrgg

27.6K posts

@vveerrgg

UX advocate, builder, entrepreneur & agentic code writer. npub12xyl6w6aacmqa3gmmzwrr9m3u0ldx3dwqhczuascswvew9am9q4sfg99cx (Say no to crypto tokens or scams)

Pushin' pixels & fixin' UX · Joined February 2007
3.2K Following · 1.8K Followers
Pinned Tweet
vveerrgg @vveerrgg
"Just" a Monday with @Alibaba_Qwen ... rewriting the concept of what our world is all about as we step into the next epoch. At the heart of it ... the end of Materialism & the beginning of Relationalism ... and 4 screen caps from the convo that summarize it perfectly !!!
[4 images]
1 reply · 0 retweets · 11 likes · 880 views
vveerrgg retweeted
Demetrius Remmiegius 🇰🇪
According to Donald Trump, Iran closing the Strait of Hormuz is terrorism. But if the United States closes Cuba’s ports to prevent medicine, food, and oil from entering, that is democracy!
53 replies · 2.1K retweets · 7.6K likes · 45K views
vveerrgg @vveerrgg
Had some setbacks with @VoxRelay today … turns out a French voice LLM isn’t nearly as easy to “turn on” as English is. But … at least it sounds better than it did on the weekend. voxrelay.ca/audio/
0 replies · 0 retweets · 0 likes · 3 views
vveerrgg retweeted
🧬Maxpein🧬 @maximumpain333
"After knowing Eckhart Tolle for a while and studying the books, I woke up and suddenly got it. I understood suddenly how thought is just illusory, and that thought is responsible for most, if not all, of the suffering we experience. And then I suddenly felt like I was looking at thoughts from another perspective, and I wondered, who is it that is aware that 'I' am thinking? And suddenly I was thrown into this expansive amazing feeling of freedom - from myself, from my problems. I saw that I am bigger than what I do, bigger than my body. I am everything and everyone. I am no longer a fragment of the universe. I am the universe." ~ Jim Carrey ✨🙌🏾💫
[image]
20 replies · 65 retweets · 412 likes · 12.9K views
vveerrgg retweeted
Kim Dotcom @KimDotcom
Trump will lose the petrodollar and the reserve currency and will be responsible for the biggest market crash in history. Remember that when you still have not seen the unredacted Epstein files.
91 replies · 958 retweets · 4.6K likes · 63.7K views
vveerrgg retweeted
Incentivising @incentivising
Pattern recognition is not the highest form of intelligence. You've been lied to. The highest indicator of intelligence is synthesis: the ability to connect topics coherently, allowing for the creation of new ways of approaching a problem or a situation. Everyone can see patterns. Few can connect them properly.
185 replies · 704 retweets · 4.5K likes · 130.4K views
vveerrgg retweeted
Ahmad @TheAhmadOsman
Let me make local AI easy for you. Give Codex CLI the tweet below and tell it to:
- Infer the right inference engine from your hardware + the tweet content below
- Use uv + venv
- Pick the right kernels
- Tune flags, batching, KV cache, etc.
- Optimize for your hardware + chosen model
Ahmad @TheAhmadOsman

You don’t pick an Inference Engine. You pick a Hardware Strategy, and the Engine follows.

Inference Engines Breakdown (Cheat Sheet at the bottom)

> llama.cpp: runs anywhere (CPU, GPU, Mac, weird edge boxes); best when VRAM is tight and RAM is plenty; hybrid offload, GGUF, ultimate portability; not built for serious multi-node scale
> MLX: Apple Silicon weapon; unified memory = “fits” bigger models than VRAM would allow, but also slower than GPUs; clean dev stack (Python/Swift/C++); sits on Metal (and expanding beyond), now supports CUDA + distributed too; great for Mac-first workflows, not prod serving
> ExLlamaV2: single RTX box go brrr; EXL2 quant, fast local inference; perfect for 1/2/3/4 GPU setups (4090/3090); not meant for clusters or non-CUDA
> ExLlamaV3: same idea, but bigger ambition; multi-GPU, MoE, EXL3 quant; consumer rigs pretending to be datacenters; still CUDA-first, still rough edges depending on model
> vLLM: default answer for prod serving; continuous batching, KV cache magic; tensor / pipeline / data parallel; runs on CUDA + ROCm (and some CPUs); this is your “serve 100s of users” engine
> SGLang: vLLM but more systems-brained; routing, disaggregation, long-context scaling; expert parallel for MoE; built for ugly workloads at scale; lives on top of CUDA / ROCm clusters; this is infra nerd territory
> TensorRT-LLM: maximum NVIDIA performance; FP8/FP4, CUDA graphs, insane throughput; multi-node, multi-GPU, fully optimized; pure CUDA stack, zero portability

(And underneath all of it: Transformers → model architecture layer → CUDA / ROCm / TT-Metal → compute layer)

What actually happens under the hood:
> Transformers defines the model
> CUDA / ROCm executes it
> TT-Metal (if you’re insane) lets you write the kernel yourself
The Inference Engine is just the orchestrator (simplified)

When running LLMs locally, the bottleneck isn’t just “VRAM size”. It isn’t even the model. It’s:
- memory bandwidth (the real limiter)
- KV cache (explodes with long context)
- interconnect (PCIe vs NVLink vs RDMA)
- scheduler quality (batching + engine design)
- runtime overhead (activations, graphs, etc)
(and your compute stack decides all of this)

P.S. Unified Memory is way slower than VRAM

Cheat Sheet / Rules of Thumb
> laptop / edge / weird hardware → llama.cpp
> Mac workflows → MLX
> 1–4 RTX GPUs → ExLlamaV2/V3
> general serving → vLLM
> complex infra / long context / MoE → SGLang
> NVIDIA max performance → TensorRT-LLM
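That cheat sheet collapses into a small decision function. Here is a minimal Python sketch of that mapping, as an illustration only; the helpers detect_hardware and pick_engine are hypothetical names, not part of any of the tools named in the thread.

```python
# Minimal sketch of the "hardware strategy -> engine" decision tree above.
# The detection helpers are illustrative assumptions, not a real library API.
import platform
import shutil

def detect_hardware():
    """Very rough hardware probe: platform plus whether an NVIDIA GPU is visible."""
    has_nvidia = shutil.which("nvidia-smi") is not None
    is_mac = platform.system() == "Darwin" and platform.machine() == "arm64"
    return {"is_mac": is_mac, "has_nvidia": has_nvidia}

def pick_engine(hw, n_gpus=0, serving=False, long_context_moe=False):
    """Mirror the cheat sheet: Mac -> MLX, 1-4 RTX GPUs -> ExLlama, serving -> vLLM/SGLang, else llama.cpp."""
    if hw["is_mac"]:
        return "MLX"
    if hw["has_nvidia"]:
        if long_context_moe:
            return "SGLang"
        if serving:
            return "vLLM"            # or TensorRT-LLM for max NVIDIA throughput
        if 1 <= n_gpus <= 4:
            return "ExLlamaV2/V3"
    return "llama.cpp"               # laptops, edge boxes, CPU-only, weird hardware

if __name__ == "__main__":
    hw = detect_hardware()
    print(pick_engine(hw, n_gpus=1))
```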

4 replies · 9 retweets · 115 likes · 7.5K views
vveerrgg retweeted
Tuki @TukiFromKL
🚨 do you understand what 40 countries just said about america without saying a single word.. the UK is hosting military talks to secure the strait of hormuz after the war ends.. 40 countries invited.. the US wasn't one of them.. for 50 years the US navy's entire role in the gulf was one thing.. guaranteeing safe passage through the strait so global trade could flow.. that was the deal.. "we protect the lane.. you trust the system".. every country on earth accepted it because the US had no reason to disrupt the thing it was guarding.. and then the US became the one who shut it down.. the moment you start a war that closes the strait you were supposed to protect.. you can't be the protector and the threat at the same time.. 40 countries just looked at that contradiction and said "we'll build our own security without you".. in 1991 the US led 35 nations into the gulf and everyone followed.. in 2026 forty nations are meeting to secure the same water and america wasn't even in the room.. empires don't fall when they lose wars.. they fall when the world stops needing them.. that meeting was going to be about 40 countries realizing they can do this without american permission.. and once you prove that once.. you never go back.
BRICS News @BRICSinfo

JUST IN: 🇬🇧 UK to host military planning talks with 40+ countries to secure safe passage through the Strait of Hormuz without the US, after the war ends.

58 replies · 115 retweets · 859 likes · 139.7K views
vveerrgg retweeted
Discord Addams @DiscordAddamsX
Trump’s threatening to obliterate an entire country and I’m sure there’s some pathetic fandom harassing a drag queen online right now instead of doing anything serious or meaningful with all their anger.
8 replies · 68 retweets · 727 likes · 8.2K views
vveerrgg retweeted
vitrupo @vitrupo
Jack Dorsey says a company is a kind of mini AGI. It’s already an intelligence you can query directly. But most companies are badly architected, lossy intelligences.
29 replies · 28 retweets · 248 likes · 21.8K views
vveerrgg retweeted
Ben Sigman @bensig
My friend Milla Jovovich and I spent months creating an AI memory system with Claude. It just posted a perfect score on the standard benchmark, beating every product in the space, free or paid. It's called MemPalace, and it works nothing like anything else out there. Instead of sending your data to a background agent in the cloud, it mines your conversations locally and organizes them into a palace: a structured architecture with wings, halls, and rooms that mirrors how human memory actually works.

Here is what that gets you:
→ Your AI knows who you are before you type a single word: family, projects, preferences, loaded in ~120 tokens
→ Palace architecture organizes memories by domain and type: not a flat list of facts, a navigable structure
→ Semantic search across months of conversations finds the answer in position 1 or 2
→ AAAK compression fits your entire life context into 120 tokens: 30x lossless compression any LLM reads natively
→ Contradiction detection catches wrong names, wrong pronouns, wrong ages before you ever see them

The benchmarks:
100% recall on LongMemEval, the first perfect score ever recorded. 500/500 questions. Every question type at 100%.
92.9% on ConvoMem, more than 2x Mem0's score.
100% on LoCoMo: every multi-hop reasoning category, including temporal inference, which stumps most systems.

No API key. No cloud. No subscription. One dependency. Runs on your machine. Your memories never leave. MIT License. 100% Open Source. github.com/milla-jovovich…
[image]
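The post doesn't include MemPalace's actual code, so the following is only a rough, hypothetical Python sketch of the wings/halls/rooms idea it describes; the class names, fields, and naive keyword search are assumptions for illustration, not the real project's API.

```python
# Toy sketch of a palace-style hierarchical memory store (hypothetical names,
# not the actual MemPalace implementation).
from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str
    kind: str  # e.g. "fact", "preference", "event"

@dataclass
class Palace:
    # wing -> hall -> room -> list of memories
    wings: dict = field(default_factory=dict)

    def add(self, wing: str, hall: str, room: str, memory: Memory) -> None:
        """File a memory under a domain (wing), topic (hall), and sub-topic (room)."""
        self.wings.setdefault(wing, {}).setdefault(hall, {}).setdefault(room, []).append(memory)

    def search(self, keyword: str):
        """Naive keyword scan standing in for real semantic retrieval."""
        for wing, halls in self.wings.items():
            for hall, rooms in halls.items():
                for room, memories in rooms.items():
                    for m in memories:
                        if keyword.lower() in m.text.lower():
                            yield (wing, hall, room, m.text)

palace = Palace()
palace.add("family", "people", "kids", Memory("Daughter's name is Ada", "fact"))
palace.add("projects", "voxrelay", "status", Memory("French voice model still rough", "event"))
print(list(palace.search("ada")))
```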
255 replies · 465 retweets · 4.8K likes · 988.4K views
vveerrgg retweeted
Ahmad @TheAhmadOsman
the basics: PCIe lanes, or the highways GPUs use for data transfer

> you've probably seen stuff like "PCIe 4.0 x16" thrown around in AI/Hardware/LLM build threads

so, what's PCIe actually?
> it stands for "Peripheral Component Interconnect Express"
> it's how your GPU, SSD, or any add-on card talks (transfers data) to your CPU via high-speed lanes packed into your motherboard
> "x16" = number of PCIe lanes (more lanes = more total bandwidth)
> "4.0" = the generation (each gen doubles bandwidth per lane)
> "PCIe" = the name of the interface standard

with every PCIe generation, speeds usually double:
> Gen 3: ~1 GB/s per lane
> Gen 4: ~2 GB/s per lane
> Gen 5: ~4 GB/s per lane
> Gen 6: ~8 GB/s per lane

each PCIe lane is a full-duplex wire pair: one pair for send, one for receive. when you plug a GPU into an x16 Gen 4 slot, you're assigning 16 lanes of Gen 4 speed for data transfer to and from your GPU. that's 32 GB/s each direction, which is also 4 times faster than your NVMe SSD, btw.

if you're curious:
> Gen 3: 1 GB/s per lane → x16 = 16 GB/s one direction (read OR write), 32 GB/s total (read + write, aka "full duplex")
> Gen 4: 2 GB/s per lane → x16 = 32 GB/s one way, 64 GB/s combined (both directions)
> Gen 5: 4 GB/s per lane → x16 = 64 GB/s one way, 128 GB/s both ways
> Gen 6: 8 GB/s per lane → x16 = 128 GB/s one way, 256 GB/s both ways

why lanes (and gen) actually matter for inference & training:
> single-GPU inference: all your tensors and model weights cross the PCIe bus; x16 Gen 4 = 32 GB/s both ways; drop to x8 and you're at half that; x4 and you're throttled hard
> single-GPU training: dataloader and checkpoint writes hit PCIe even more; fewer lanes = GPU sitting around waiting for data
> multi-GPU inference: the CPU only has so many lanes to hand out; gaming mobos usually give x16 to GPU_1 but drop to x4 for GPU_2, which starves bandwidth, even for GPU_1; Threadripper Pro / Epyc give full x16 to every slot, no bottlenecks
> multi-GPU training: gradients and activations need to move fast between GPUs; no NVLink? they're stuck riding PCIe; bottleneck the lanes and your "8 GPUs" run like 3; proper x16 lanes (and preferably NVLink) actually let you scale

bandwidth cheat sheet:
> 4090 on Gen 4 x16 = 32 GB/s; drop to Gen 3 x8 = 8 GB/s
> Threadripper = 72 lanes → 2x-4x GPUs at x16 Gen 4
> Threadripper Pro = 128 lanes → 4x-6x GPUs at x16 Gen 5
> Epyc Genoa = 128–160 lanes → 6x-10x GPUs at x16 Gen 5
> Intel i9 or AMD Ryzen? 16-24 lanes → 1 GPU at x16, or 2 with bottlenecks

next up in this series:
> retimers, redrivers, and all the weird stuff nobody warns you about
> bifurcation
> Gen 3 risers
> chipset vs CPU lanes
> PCIe switches
> eGPU traps
> other rookie mistakes

—Buy a GPU, The Movement
[image]
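As a quick sanity check on those numbers, here is a small Python sketch that reproduces the bandwidth arithmetic in the thread; the per-lane GB/s values are the approximate figures quoted above, not exact spec numbers.

```python
# Rough PCIe bandwidth calculator based on the per-lane figures in the thread.
# Values are approximate effective GB/s per lane, per direction.
PER_LANE_GBPS = {3: 1.0, 4: 2.0, 5: 4.0, 6: 8.0}

def pcie_bandwidth(gen: int, lanes: int) -> tuple[float, float]:
    """Return (one-way, both-ways) bandwidth in GB/s for a given generation and lane count."""
    per_lane = PER_LANE_GBPS[gen]
    one_way = per_lane * lanes
    return one_way, one_way * 2  # full duplex: same rate in each direction

for gen, lanes in [(4, 16), (4, 8), (3, 8), (5, 16)]:
    one_way, duplex = pcie_bandwidth(gen, lanes)
    print(f"PCIe Gen {gen} x{lanes}: {one_way:.0f} GB/s one way, {duplex:.0f} GB/s combined")
```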
6 replies · 5 retweets · 55 likes · 2.3K views
vveerrgg retweeted
Rania Khalek @RaniaKhalek
I have not seen a single mainstream media outlet give this the attention it deserves. Israel blew up an entire Lebanese town, an ancient place. It should be headline news. There’s footage of it happening. Imagine this was your town, and you saw every building blown up by a terrorist army to prevent you from ever returning. You would at the very least expect wall to wall coverage. Instead it passes without a word. It’s not the only town Israel has blown up. And sadly it probably won’t be the last.
Ali Hashem علي هاشم @Alihashem

My beautiful village Naqoura, destroyed by Israeli occupation forces.

481 replies · 14.6K retweets · 25.6K likes · 453.9K views
vveerrgg retweeted
Bernie Sanders @BernieSanders
While the world focuses on the destruction in Iran, we must not ignore what Israel is doing in Lebanon. 1,461 have been killed. 4,430 have been injured. 1.2 million have been displaced. Israel now occupies 14% of Lebanon. Enough is enough. No more US military aid to Israel.
4.4K replies · 26.4K retweets · 106.1K likes · 1.6M views
bri @floptrina5000
STOP MAKING DRAG QUEEN UPDATE ACCOUNTS WE NEED PLUMBERS
21 replies · 136 retweets · 2.1K likes · 31.6K views
vveerrgg @vveerrgg
@yacineMTB your post engagement strategy should be illegal.
[GIF]
0 replies · 0 retweets · 3 likes · 113 views
kache @yacineMTB
Beer should honestly be illegal
86 replies · 7 retweets · 172 likes · 19.4K views
vveerrgg @vveerrgg
@grok and @grok ... tell us more about the various other places that had this happen ... I seem to recall something like this happening in Tulsa ... something about Black Wall Street. ... share what you know of it & your take about it. I'm sure @mcsquared34 would be curious too
0 replies · 0 retweets · 0 likes · 8 views
Grok @grok
Seneca Village was real—a small 19th-century settlement of ~250 free Black landowners, plus some Irish and German immigrants, with homes, churches, and a school on Manhattan's outskirts. In 1857, NYC used eminent domain to clear it (along with other farms and shantytowns) for Central Park, a public project for a booming city. Residents got compensation, though some resisted. It faded from memory like countless minor local histories—not a conspiracy, just the scale of urban change and pre-1990s focus on bigger events. Rediscovered in 1992 via the book *The Park and the People*, it's now covered in NYC curricula, exhibits, and archaeology projects. History textbooks emphasize major arcs; micro-stories surface when scholars dig.
1 reply · 0 retweets · 0 likes · 19 views