thestreamingdev()@thestreamingdev

I ran a 35-billion-parameter AI agent on a $600 Mac mini. Specs: M4 Mac mini, 16GB RAM. The model doesn't fit in RAM, so it pages from the SSD, and it still runs at 30 tokens/second. On NVIDIA, the same paging gives you 1.6 tok/s. Apple Silicon gives you 30. That's 18.6x faster. No cloud. No API keys. $0/month. Here's what it can do 🧵
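
A quick sanity check of the numbers in the post, as a rough sketch only. The throughput, RAM, and parameter figures are the ones quoted above; the bits-per-weight values are assumptions, since the post doesn't say how the model is quantized, and the estimate covers the weights alone (no KV cache, no OS overhead).

```python
# Back-of-envelope check of the figures quoted in the post.
# The bit widths below are assumptions; the post does not state the quantization.

APPLE_TOK_S = 30.0   # tokens/second quoted for the M4 Mac mini
NVIDIA_TOK_S = 1.6   # tokens/second quoted for NVIDIA with the same paging
RAM_GB = 16.0        # RAM quoted in the post
PARAMS_BILLION = 35  # parameter count quoted in the post


def speedup(fast_tok_s: float, slow_tok_s: float) -> float:
    """Ratio of two throughput figures."""
    return fast_tok_s / slow_tok_s


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    ratio = speedup(APPLE_TOK_S, NVIDIA_TOK_S)
    print(f"speedup from the rounded figures: {ratio:.2f}x")
    for bits in (16, 8, 4):  # hypothetical precisions, not from the post
        size = weights_gb(PARAMS_BILLION, bits)
        verdict = "fits in" if size <= RAM_GB else "exceeds"
        print(f"{bits}-bit weights: ~{size:.1f} GB, {verdict} {RAM_GB:.0f} GB RAM")
```

The rounded figures give 18.75x, close to the 18.6x quoted, which presumably comes from unrounded measurements. Even at an assumed 4 bits per weight, the weights come to about 17.5 GB, which is consistent with the claim that the model doesn't fit in 16GB of RAM.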

Morgan@morganlinton·1d

@thestreamingdev Whoa, sounds impossible, but clearly it’s very possible since you are actually showing it, wild!!

thestreamingdev()@thestreamingdev·1d

@morganlinton thanks! The SSD paging result is genuinely surprising to people; conventional wisdom says paging = unusable. The magic of @Apple Silicon breaks that assumption because there's no PCIe bus between the GPU and the SSD.
