Nicolas Pratt

25.3K posts

@nicolaspratt

Apple, IT, music, books, family, sports, cats

Richmond, VA · Joined March 2007
1.2K Following · 624 Followers
Nicolas Pratt retweeted
Soleil @soleiljolina
Weirded out by homes with no pets. Like where is your creature
597 replies · 13.5K reposts · 112.2K likes · 1.8M views
Nicolas Pratt retweeted
Champaign Showers @217Showers
every millennial whose love of basketball is due in part to Deron Williams and Dee Brown is smiling ear to ear right now. Long time coming.
15 replies · 89 reposts · 1.1K likes · 19K views
Nicolas Pratt retweeted
Illinois Men's Basketball
ARE YOU WITH US? For the first time since 2005, we're heading to the FINAL FOUR.
[image]
260 replies · 3.7K reposts · 13.2K likes · 602.5K views
Nicolas Pratt retweeted
Top Tier @TopTierState
Illinois vs Iowa
[2 images]
11 replies · 250 reposts · 6.3K likes · 336.2K views
Nicolas Pratt retweeted
Barstool Sports @barstoolsports
“What did he whisper to you?” “I can’t tell you” “Come on, I won’t tell anyone” “No I can’t. Secret, secret” David Mirkovic is box office
86 replies · 437 reposts · 11.4K likes · 1.9M views
Nicolas Pratt retweeted
Jeremy @JeromeyR0me
Nothing says “Eastern Europe” quite like Champaign-Urbana, Illinois
23 replies · 147 reposts · 3.1K likes · 77.5K views
Marko Ilic @markoilico
If you're now designing or redesigning a website, this will help you a lot. I recently curated the best hero sections, footers, social proof and other website parts because I got tired of having 15+ tabs open (even with Mobbin). Giving it away 100% free. Comment on this post, and I'll send a Figma link to your inbox!
[GIF]
4.9K replies · 228 reposts · 5.2K likes · 398.7K views
Nicolas Pratt retweeted
Anish Moonka @anishmoonka
Every time you message an AI chatbot, the model stores your entire conversation in temporary memory called a KV cache (a cheat sheet so it doesn’t re-read everything from scratch).
On a large model like Llama 70B running a long conversation, that cache alone eats 40GB of GPU space, often more than the AI model itself. That’s half a $30,000 GPU chip consumed by one user’s memory.
Google just published TurboQuant, a compression algorithm that shrinks this cache by 6x, down to just 3 bits per value, with zero accuracy loss across every benchmark tested. No retraining. No fine-tuning. Drop-in replacement.
AI inference (running models for actual users, not training them) now makes up 55% of all AI compute spending. Hyperscalers are pouring nearly $700 billion into AI infrastructure in 2026. The KV cache is the single biggest memory bottleneck in that stack.
When GPU cache memory fills up, the system can’t take more users. 6x compression means the same hardware handles roughly 6x more simultaneous conversations, or 6x longer context windows, or some mix of both. At cloud rates of $2-3/hour per H100 GPU, that’s the difference between profitable and unprofitable AI deployment.
TurboQuant randomly rotates data to simplify its structure, applies a compressor, then adds a 1-bit error correction step to catch errors before they compound. On H100 GPUs it delivers up to 8x speedup over uncompressed computation.
Google tested it across five long-context benchmarks on Llama, Gemma, and Mistral models. Perfect scores on needle-in-a-haystack (finding one specific fact buried in massive text). Being presented at ICLR 2026.
It also outperforms existing methods for vector search, the technology that powers how search engines find similar results across billions of entries. Google runs billions of these searches daily.
Three bits. Zero loss. 6x compression on the biggest memory bottleneck in a $700 billion infrastructure buildout.
Quoting Google Research @GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

47 replies · 178 reposts · 2.3K likes · 365.5K views
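The memory figure in the post above can be sanity-checked with rough arithmetic. The sketch below is only an illustration: the model shape (80 layers, 8 key-value heads, head dimension 128, 16-bit values) is an assumed Llama-2-70B-style configuration, not something stated in the post, and `kv_cache_bytes` is a hypothetical helper name. It counts raw bits per stored value and does not model TurboQuant itself.

```python
def kv_cache_bytes(tokens, layers=80, kv_heads=8, head_dim=128, bits_per_value=16):
    """Raw KV cache size in bytes for one conversation.

    All defaults are assumptions (a Llama-2-70B-style shape), not values
    taken from the post.
    """
    # Each token stores one key vector and one value vector per layer and KV head.
    values_per_token = 2 * layers * kv_heads * head_dim
    return tokens * values_per_token * bits_per_value // 8


GIB = 1024 ** 3
tokens = 131_072  # assumed context length of 128K tokens

fp16 = kv_cache_bytes(tokens)
three_bit = kv_cache_bytes(tokens, bits_per_value=3)

print(f"16-bit cache: {fp16 / GIB:.1f} GiB")
print(f"3-bit cache:  {three_bit / GIB:.1f} GiB")
print(f"raw bit ratio: {fp16 / three_bit:.2f}x")
```

Under these assumptions a 128K-token conversation needs about 40 GiB at 16 bits per value, which lines up with the figure quoted in the post. Counting raw bits alone gives a ratio of 16/3, so the headline compression number depends on details this sketch leaves out.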
Nicolas Pratt retweeted
shayla @callmeMaharani
Just to be clear, we already have free data centers. They are called libraries.
78 replies · 7K reposts · 34.9K likes · 262.1K views
Nicolas Pratt retweeted
Elvin @elvin_not_11
it's beautiful that I can traverse through 25 years of UI design history by clicking 3 times on Windows 11.
[image]
276 replies · 2.4K reposts · 46.6K likes · 1.1M views
Nicolas Pratt retweeted
The Uncultured Black AristoCAT @camarawilliams
You weren’t spending 5 hours at the airport under Biden….matter of fact….the airlines were required to pay you money if they messed up…..just food for thought.
368 replies · 13K reposts · 96K likes · 1.5M views
Nicolas Pratt retweeted
Illinois Athletics @IlliniAthletics
The only one. No comparisons.
[image]
29 replies · 428 reposts · 2.6K likes · 198.2K views