
Google dropped the TurboQuant paper yesterday morning. 36 hours later it's running in llama.cpp on Apple Silicon, faster than the baseline it replaces.
the numbers:
- 4.6x KV cache compression
- 102% of q8_0 speed (yes, faster: smaller cache = less memory bandwidth)
- PPL within 1.3% of baseline (verified, not vibes)
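a quick back-of-envelope on why a smaller KV cache buys speed (illustrative 7B-class shape, not numbers from the repo):

```python
# KV cache sizing sketch -- assumed model shape: 32 layers, 8 KV heads,
# head_dim 128, 8k context. K and V each store n_ctx * n_kv_heads * head_dim
# elements per layer, so halving bits-per-element halves bandwidth too.
def kv_cache_bytes(n_layers, n_ctx, n_kv_heads, head_dim, bits_per_elem):
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bits_per_elem // 8

fp16 = kv_cache_bytes(32, 8192, 8, 128, 16)
compressed = fp16 / 4.6  # the 4.6x ratio reported above
print(f"fp16: {fp16 / 2**30:.2f} GiB -> compressed: {compressed / 2**30:.2f} GiB")
# -> fp16: 1.00 GiB -> compressed: 0.22 GiB
```

every decode step streams the whole cache through memory, so that size difference is bandwidth you get back for free.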
the optimization journey:
739 → starting point (fp32 rotation)
1074 → fp16 WHT
1411 → half4 vectorized butterfly
2095 → graph-side rotation (the big one)
2747 → block-32 + graph WHT. faster than q8_0.
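for anyone curious what the WHT butterfly actually looks like, here's a minimal scalar sketch (the Metal kernel vectorizes the inner pairs with half4, but the structure is the same):

```python
def fwht(x):
    """In-place fast Walsh-Hadamard transform via butterflies, O(n log n)
    instead of the naive O(n^2) matrix multiply. len(x) must be a power of
    two; unnormalized (scale by 1/sqrt(n) for an orthonormal rotation)."""
    n = len(x)
    assert n and n & (n - 1) == 0
    h = 1
    while h < n:
        # each pass pairs elements h apart and does one add/sub per pair
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x

print(fwht([1, 0, 0, 0]))  # [1, 1, 1, 1]
```

the pair-wise add/sub pattern is what maps cleanly onto SIMD lanes — no multiplies, just shuffles and adds.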
3.72x speedup in one day. from a paper I read at dinner last night.
what I learned along the way:
- the paper's QJL residual stage is unnecessary. multiple implementations confirmed this independently
- Metal silently falls back to CPU if you mess up shader includes. cost me hours
- "coherent text" output means nothing. I shipped PPL 165 thinking it worked. always run perplexity
- ggml stores column-major. C arrays are row-major. this will ruin your afternoon
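the perplexity lesson in one function — PPL is just exp of the mean negative log-likelihood per token, so a broken kernel that still samples fluent text gets caught immediately by the number:

```python
import math

def perplexity(token_nlls):
    """perplexity = exp(mean negative log-likelihood per token).
    takes a list of per-token NLLs (in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# a model assigning every token probability 1/10 scores PPL 10;
# good models on typical text land in single digits to low tens
print(perplexity([math.log(10)] * 100))
```

a PPL of 165 means the distribution is badly wrong even when individual samples read okay.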
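and the row-major/column-major footgun reduced to its essence (toy index helpers, not ggml's actual API):

```python
# the same flat buffer, indexed under the two conventions
def idx_row_major(r, c, n_cols):
    return r * n_cols + c   # C arrays: rows are contiguous

def idx_col_major(r, c, n_rows):
    return c * n_rows + r   # column-major: columns are contiguous

# 2x3 matrix [[1, 2, 3], [4, 5, 6]] laid out row-major:
buf = [1, 2, 3, 4, 5, 6]
print(buf[idx_row_major(0, 1, 3)])  # 2 -- the element you wanted
print(buf[idx_col_major(0, 1, 2)])  # 3 -- the silent transposed read
```

nothing crashes, nothing warns — you just compute on the transpose until PPL tells you otherwise.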
everything is open source. the code, the benchmarks, the speed investigation logs, the debugging pain, all of it.
github.com/TheTom/turboqu…
paper to parity in 36 hours. what a time to be alive.
