Faisal Mumtaz

3.9K posts

@digitacy

Innovating and building crazy things @HUMAIN

Saudi Arabia · Joined February 2019

709 Following · 10K Followers

Pinned Tweet

Faisal Mumtaz@digitacy·2d

I wrote an LLM inference engine from scratch in Rust. No PyTorch. No ONNX. No Python. Just one binary that downloads a model and runs GPU-native inference on Apple Silicon + NVIDIA. It beats llama.cpp at Q4 on Apple Silicon. Here’s how it works 🧵

385

983.6K

Faisal Mumtaz@digitacy·2d

Built solo to understand GPU inference end-to-end. I also document where it loses today (e.g. CUDA prefill trade-offs) so benchmarks stay honest. If this is useful, star it: github.com/faisalmumtaz89… #RustLang #LLM #LocalLLaMA

528

Faisal Mumtaz@digitacy·2d

OpenAI- and Anthropic-compatible server in one command. SSE streaming, tool calling, native tokenizer, and native format in a single Rust codebase.

599

Faisal Mumtaz@digitacy·1 Jun

@cryptogoos Just Starlink will be worth that in 10 years, so it's a steal

576

CryptoGoos@cryptogoos·31 May

CRAZY: 🇺🇸 SpaceX is going public at ~$1.8 trillion. Revenue last year: $18.7 billion. Net loss: ~$4.9 billion. That's a price-to-sales multiple near 96x. On a company currently growing revenue around 33% a year, and still losing money. To grow into that price, it has to sustain 40%+ revenue growth for 10 years straight. Almost impossible.
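The multiple quoted above can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures in the post. The function names are illustrative, and holding the valuation flat for 10 years is an assumption made here, not something the post states.

```python
def price_to_sales(valuation: float, revenue: float) -> float:
    """Valuation divided by trailing annual revenue."""
    return valuation / revenue


def revenue_after(revenue: float, growth: float, years: int) -> float:
    """Revenue after compounding at a constant annual growth rate."""
    return revenue * (1 + growth) ** years


# Figures quoted in the post.
VALUATION = 1.8e12  # ~$1.8 trillion
REVENUE = 18.7e9    # $18.7 billion last year

if __name__ == "__main__":
    print(f"P/S today: {price_to_sales(VALUATION, REVENUE):.1f}x")
    for growth in (0.33, 0.40):
        future = revenue_after(REVENUE, growth, 10)
        # Assumption: valuation stays flat, so this is the multiple "grown into".
        implied = price_to_sales(VALUATION, future)
        print(f"{growth:.0%} for 10 years: ${future / 1e9:.0f}B revenue, {implied:.1f}x P/S")
```

The post does not say which multiple would count as having grown into the price, so the sketch only reports the implied multiple under each growth rate.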

201

263

1.3K

147.2K

Faisal Mumtaz@digitacy·28 May

😂

Cybertruck@cybertruck

Hey Luce

Faisal Mumtaz@digitacy·19 May

@helloitsaustin People in the higher echelons, such as distinguished fellows, typically don't get interviewed in traditional ways. They decide whether the company and its mission are worth their time.

2.9K

austin lau@helloitsaustin·19 May

do you think he had to do the coding interview or does he get a pass?

Andrej Karpathy@karpathy

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

227

6.1K

509.6K

Faisal Mumtaz@digitacy·19 May

@karpathy @farzyness Oh damn. OpenAI is cooked

Andrej Karpathy@karpathy·19 May

11.2K

150.2K

27.6M

Faisal Mumtaz retweeted

Crémieux@cremieuxrecueil·19 May

Last year, China added as much energy to its grid as Germany has in total.

689

1.6K

5.6K

1.1M

Faisal Mumtaz@digitacy·9 May

Another great piece from @JayaGup10

Jaya Gupta@JayaGup10

x.com/i/article/2052…

4.6K

Faisal Mumtaz retweeted

Balaji@balajis·8 May

The US federal budget is $7 trillion. There are 535 in the Senate and Congress. They collectively allocate that money. Specifically: 7000/535 = 13 billion per official. Which means AOC is a political billionaire. She allocates far more than any market billionaire. Indeed she allocates ten billion liquid, per year. A market billionaire has one billion, illiquid, per life. So: AOC spends 10-100X what a market billionaire has. It's not even close. She's right about one thing, though. AOC didn't earn the tens of billions she spends. The state taxed that money, by force.

Breitbart News@BreitbartNews

Alexandria Ocasio-Cortez: You can't earn a billion dollars.
Ilana Glazer: That's right.
AOC: You just can't earn that.
Glazer: That's exactly correct.
AOC: You can get market power. You can break rules. You can do all sorts of things. You can abuse labor laws.
Glazer: Yup.
AOC: You can pay people less than what they're worth.
Glazer: Yup.
AOC: But you can't earn that, right?
Glazer: That's right.
AOC: And so you have to create a myth that -- since you didn't earn that, you have to create a myth of earning it.

214

622

5.5K

387K

Faisal Mumtaz@digitacy·6 May

@alex_whedon If you're going to make such claims, start with empirical proof, not marketing

Alexander Whedon@alex_whedon·5 May

Introducing SubQ - a major breakthrough in LLM intelligence. It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus
Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do. That's nearly 1,000x less compute and a new way for LLMs to scale.

1.5K

2.9K

23K

12.8M

Faisal Mumtaz@digitacy·23 Apr

@UrataSyndicate @damianplayer Exactly, he literally listed every single option. How is this a revelation?

696

urata ☄️@UrataSyndicate·22 Apr

@damianplayer What else would you even do? What else would the expectation even be? This note feels like it is saying nothing

201

33.2K

Damian Player@damianplayer·22 Apr

this is why Elon’s companies move 10x faster than most. every founder should run their team like this: push back, ask, or execute.

181

1.1K

18.4K

964.1K

Faisal Mumtaz@digitacy·23 Apr

@zan2434 @eddiejiao_obj @drewocarr Dude what!?

135

Zain Shah@zan2434·22 Apr

Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see. @eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)

1.1K

3.7K

28.8K

Faisal Mumtaz@digitacy·15 Apr

@hthieblot No

Hubert Thieblot@hthieblot·15 Apr

pitch me your company in 1 word.

2.4K

1.1K

288.9K

Faisal Mumtaz@digitacy·15 Apr

@stats_feed @grok Compare SpaceX launches vs rest of the world, 2025 and 2026

1.7K

World of Statistics@stats_feed·15 Apr

🚀 Successful orbital rocket launches in 2026
🇺🇸 United States: 56
Rest of the world: 26
Rest of the world:
🇨🇳 China: 20
🇷🇺 Russia: 5
🇫🇷 France: 1

103

2.4K

116.4K

Faisal Mumtaz@digitacy·31 Mar

I will give this a shot on Lumen, the inference engine I published recently; however, I am highly skeptical. HBM bandwidth is 3-4 TB/s, while fast PCIe 5.0 NVMe SSDs are 5-10 GB/s. There's a reason why Apple doesn't market SSD performance. Notebookcheck’s 1TB M4 Max review unit measured about 5.06 GB/s read and 6.24 GB/s write in Blackmagic Disk Speed Test. The Verge’s 4TB M4 Max review unit measured about 7.34 GB/s sustained read and 7.97 GB/s sustained write. Good work though, interesting study. github.com/faisalmumtaz89…
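To put the bandwidth gap in concrete terms, here is a rough back-of-the-envelope sketch. It assumes the scenario under discussion is reading every model weight once per generated token, which the reply implies but does not state. The 40 GB model size is a hypothetical value chosen for illustration, and the function and variable names are not from Lumen.

```python
GB = 1e9
TB = 1e12


def seconds_per_token(weight_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Lower bound on latency: time to read every weight byte once."""
    return weight_bytes / bandwidth_bytes_per_s


# Hypothetical model size, chosen only for illustration (not from the post).
WEIGHT_BYTES = 40 * GB

# Bandwidth figures quoted in the post.
BANDWIDTHS = {
    "HBM (low end, 3 TB/s)": 3 * TB,
    "M4 Max 4TB SSD read (7.34 GB/s)": 7.34 * GB,
    "M4 Max 1TB SSD read (5.06 GB/s)": 5.06 * GB,
}


def report() -> dict:
    """Map each bandwidth source to its lower-bound seconds per token."""
    return {name: seconds_per_token(WEIGHT_BYTES, bw) for name, bw in BANDWIDTHS.items()}


if __name__ == "__main__":
    for name, secs in report().items():
        print(f"{name}: {secs:.3f} s/token")
```

The estimate ignores caching, compute time, and any scheme that reads only part of the weights per token, so it only bounds the naive case.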

Dan Woods@danveloper·18 Mar

x.com/i/article/2034…

186

1.3K

660.6K
