thestreamingdev
@thestreamingdev · 4.5K posts

all things ai and coding while streaming, DM for consulting.

Joined January 2022
565 Following · 1.7K Followers

Pinned Tweet
thestreamingdev @thestreamingdev ·
The scaling path:
- 16GB Mac mini → 35B agent ($0/month)
- 48GB Mac Pro → 35B at higher quality + speculative decoding
- 192GB Mac Studio → 397B frontier model
- 512GB Mac Pro → 1 TRILLION parameter model

Same agent code. Zero changes. Just swap the model file.

Everything is open source: the agent, the benchmarks, the retro Mac web UI, all of it. 🍎 github.com/walter-grace/m…

One ask: I'd love to test this on a Mac Studio or Mac Pro with 192GB+. If you have one collecting dust and want to help push local AI forward, DM me. I'll run a frontier model on it and publish everything.

There are 100 million Macs with Apple Silicon in the world. Every one of them is an untapped AI workstation. Time to use them.
17 replies · 33 reposts · 420 likes · 43.1K views
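The scaling table above boils down to a RAM → model-tier lookup. A minimal sketch of the "same agent code, just swap the model file" idea — the tier names here are illustrative placeholders, not the actual repo's model choices:

```python
# Hypothetical RAM -> model tier mapping, mirroring the scaling path above.
# Tier names are illustrative, not the project's real model files.
MODEL_FOR_RAM = [
    (16,  "35B-quantized"),     # 16GB Mac mini
    (48,  "35B-high-quality"),  # 48GB Mac Pro
    (192, "397B-frontier"),     # 192GB Mac Studio
    (512, "1T-frontier"),       # 512GB Mac Pro
]

def pick_model(ram_gb: int) -> str:
    """Return the largest model tier this machine's RAM supports."""
    chosen = MODEL_FOR_RAM[0][1]
    for min_ram, model in MODEL_FOR_RAM:
        if ram_gb >= min_ram:
            chosen = model
    return chosen

print(pick_model(16), pick_model(64), pick_model(512))
```

The agent code never changes; only the value returned here (the model file) does.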
thestreamingdev @thestreamingdev ·
I ran a 35-billion-parameter AI agent on a $600 Mac mini.

Specs: M4 Mac mini, 16GB RAM.

The model doesn't fit in RAM. It pages from the SSD at 30 tokens/second. On NVIDIA, the same paging gives you 1.6 tok/s. Apple Silicon gives you 30. That's 18.75x faster.

No cloud. No API keys. $0/month. Here's what it can do 🧵
169 replies · 211 reposts · 3.2K likes · 683.2K views
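The speedup claim above is simple arithmetic on the two throughput figures quoted in the tweet:

```python
# Back-of-envelope check of the thread's throughput claim:
# SSD-paged decoding at 30 tok/s on Apple Silicon vs 1.6 tok/s
# when a discrete GPU must page weights over PCIe.
apple_tok_s = 30.0
nvidia_tok_s = 1.6
speedup = apple_tok_s / nvidia_tok_s
print(f"{speedup:.2f}x")  # 18.75x
```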
Google Research @GoogleResearch ·
Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI
[GIF] · 907 replies · 5.5K reposts · 37.4K likes · 17.8M views
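To see why a ≥6x KV-cache reduction matters, here is the standard transformer KV-cache sizing formula (this is generic sizing, not TurboQuant's actual method; the model dimensions are illustrative):

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem):
    # Keys and values each store one vector per layer, per KV head,
    # per position: hence the factor of 2.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# Illustrative 7B-class config, fp16 cache, 32K context:
base = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128,
                      seq_len=32768, bytes_per_elem=2)
compressed = base / 6  # the >=6x reduction claimed above
print(f"{base / 2**30:.1f} GiB -> {compressed / 2**30:.2f} GiB")
```

A 6x smaller cache means proportionally longer contexts (or more concurrent requests) in the same memory budget.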
Yifei Hu @hu_yifei ·
Qwen3.5 27B feels more solid than 35B-A3B, because a dense model is more solid than a sparse model. (English is not my first language, but I really tried here)
26 replies · 5 reposts · 361 likes · 20.7K views
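One way to read the dense-vs-sparse comparison above: "A3B" in the MoE model's name indicates roughly 3B parameters active per token, while a dense 27B model activates all 27B. A rough per-token compute comparison, assuming the common ~2 × active-parameters FLOPs-per-token rule of thumb:

```python
# Rough per-token compute behind the dense-vs-sparse remark.
# Assumption: FLOPs per token ~= 2 * active parameters (rule of thumb).
dense_active = 27e9  # dense 27B: every parameter active per token
moe_active = 3e9     # 35B-A3B: ~3B of 35B parameters active per token

flops_dense = 2 * dense_active
flops_moe = 2 * moe_active
print(f"dense does {flops_dense / flops_moe:.0f}x more compute per token")
```

The dense model spends far more compute on each token, which is one plausible reason it can feel "more solid" despite the MoE's larger total parameter count.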
Andrej Karpathy @karpathy ·
When I built menugen ~1 year ago, I observed that the hardest part by far was not the code itself, it was the plethora of services you have to assemble like IKEA furniture to make it real, the DevOps: services, payments, auth, database, security, domain names, etc...

I am really looking forward to a day when I could simply tell my agent: "build menugen" (referencing the post) and it would just work. The whole thing, up to the deployed web page. The agent would have to browse a number of services, read the docs, get all the API keys, make everything work, debug it in dev, and deploy to prod. This is the actually hard part, not the code itself.

Or rather, the better way to think about it is that the entire DevOps lifecycle has to become code, in addition to the necessary sensors/actuators of the CLIs/APIs with agent-native ergonomics. And there should be no need for the human to visit web pages, click buttons, or anything like that.

It's easy to state, it's now just barely technically possible and might plausibly work, but it definitely requires from-scratch re-design, work, and thought. Very exciting direction!
Patrick Collison @patrickc

When @karpathy built MenuGen (karpathy.bearblog.dev/vibe-coding-me…), he said: "Vibe coding menugen was an exhilarating and fun escapade as a local demo, but a bit of a painful slog as a deployed, real app. Building a modern app is a bit like assembling IKEA furniture. There are all these services, docs, API keys, configurations, dev/prod deployments, team and security features, rate limits, pricing tiers."

We've all run into this issue when building with agents: you have to scurry off to establish accounts, clicking things in the browser as though it's the antediluvian days of 2023, in order to unblock its superintelligent progress.

So we decided to build Stripe Projects to help agents instantly provision services from the CLI. For example, simply run:

$ stripe projects add posthog/analytics

And it'll create a PostHog account, get an API key, and (as needed) set up billing.

Projects is launching today as a developer preview. You can register for access (we'll make it available to everyone soon) at projects.dev. We're also rolling out support for many new providers over the coming weeks. (Get in touch if you'd like to make your service available.)

projects.dev

443 replies · 388 reposts · 4.6K likes · 1.5M views
thestreamingdev @thestreamingdev ·
@MarioClawAI "Hey Claude, make this run with my openclaw." It just needs to run the model locally.
0 replies · 0 reposts · 2 likes · 2.5K views
thestreamingdev @thestreamingdev ·
@dzienko As long as the RAM is there, it should work just as well. I'm guessing this is why Apple launched their new laptops.
0 replies · 0 reposts · 1 like · 279 views
thestreamingdev @thestreamingdev ·
@morganlinton Thanks! The SSD paging result is genuinely surprising to people; conventional wisdom says paging = unusable. The magic of @Apple Silicon breaks that assumption because there's no PCIe bus between the GPU and SSD.
0 replies · 2 reposts · 11 likes · 5.7K views
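The paging trick discussed above generally rests on memory-mapping the weight file so the OS faults pages in from disk on demand instead of loading everything into RAM up front. A minimal stdlib illustration of that mechanism (not the project's actual code), using a small stand-in file:

```python
import mmap
import os
import tempfile

# Create a 1 MiB stand-in for a weight shard on disk.
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(b"\x01" * (1 << 20))

# Memory-map it: nothing is read into RAM yet.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Touching a byte triggers a page fault that loads that page from disk.
    first, last = mm[0], mm[-1]
    mm.close()

print(first, last)  # 1 1
```

With this pattern, only the pages actually touched during inference occupy RAM, which is how a model larger than physical memory can still run.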
Morgan @morganlinton ·
@thestreamingdev Whoa, sounds impossible, but clearly it’s very possible since you are actually showing it, wild!!
1 reply · 0 reposts · 1 like · 6.5K views
thestreamingdev @thestreamingdev ·
@nadavwiz Please try it!! Leverage the MLX package; I think you'll see some success.
0 replies · 0 reposts · 2 likes · 838 views