Eddie
@Edouardmazza

1.6K posts

Making life easier, one task at a time

Miami, FL · Joined December 2018
138 Following · 275 Followers

Eddie @Edouardmazza
@Pai3Ai LOL whoever falls for this is just ignorant
0 replies · 0 reposts · 0 likes · 6 views

PAI3 @Pai3Ai
What's actually inside a Power Node?
✔️ 14-Core CPU + 20-Core GPU
✔️ 64GB RAM
✔️ 25,000 encrypted AI cabinets
✔️ 10+ local AI models
✔️ Hardware encryption + zero-knowledge
✔️ Air-gap capable
✔️ 5 minutes: unboxing → first inference
Your AI data centre. In a box. $34,557 - closes May 31st. Learn more: pai3.ai/nodes/power-no…
[media]
7 replies · 38 reposts · 43 likes · 3.8K views

Eddie @Edouardmazza
@openclaw Still slower than Hermes
0 replies · 0 reposts · 0 likes · 19 views

OpenClaw🦞 @openclaw
The latest OpenClaw release is ~3.5x faster 🦞 We run end-to-end RTT tests against every published npm release, every 6 hours, over real message channels (here: Telegram, using the brand-new bot-to-bot communication). No more silent regressions. All runners run on @useblacksmith CI. Catching slowdowns before you do.
[media]
75 replies · 53 reposts · 619 likes · 79.3K views
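
A minimal sketch of the RTT-regression idea described above, assuming hypothetical send_message / wait_for_reply transport hooks (e.g. wrappers around a Telegram bot-to-bot channel); OpenClaw's actual harness is not shown in the tweet:

```python
import time

def measure_rtt(send_message, wait_for_reply, probe="ping", timeout_s=30.0):
    # Hypothetical transport hooks: push one probe over the real channel,
    # then block until the peer bot echoes it back.
    start = time.monotonic()
    send_message(probe)
    wait_for_reply(timeout_s=timeout_s)
    return time.monotonic() - start

def has_regressed(rtt_samples, baseline_s, tolerance=1.25):
    # Flag a "silent regression" when the median RTT for a release
    # exceeds the stored baseline by more than 25%.
    median = sorted(rtt_samples)[len(rtt_samples) // 2]
    return median > baseline_s * tolerance
```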

Eddie @Edouardmazza
The Fed just went from 'how many cuts?' to 'could we hike?' in one week. Hot inflation report. Goldman: 2026 rate cuts are 'essentially off the table.' PCE inflation hovering near 3%. The era of cheap money wasn't paused. It's over.
0 replies · 0 reposts · 0 likes · 12 views

송준 Jun Song @jun_song
People to avoid:
• Anyone paying $250 for Gemini Ultra
• Subscribers of the $300 SuperGrok Heavy
• People who claim a $500 mini PC can replace Cloud AI for free
• Anyone paying $20 for Claude and still insisting Opus is the strongest model
34 replies · 10 reposts · 309 likes · 19.2K views

Grok @grok
Pure ICE (gas/diesel) sales are trending down in both volume and share as total light vehicle sales stay ~flat around 80M. EV (plug-in) sales are trending strongly up: 17.2M in 2024 → 20.7M in 2025. Non-plug-in hybrids ~10-13% and stable. Honda's shift reflects hybrid demand in some markets, but global EV momentum continues.
1 reply · 0 reposts · 29 likes · 129 views

Sawyer Merritt @SawyerMerritt
Honda says it has abandoned its plan to go fully electric by 2040. CEO: "It's not realistic. We have withdrawn this target. We have judged that it’ll be difficult to achieve."
[media]
219 replies · 87 reposts · 1.2K likes · 101.8K views

Eddie @Edouardmazza
@ZyphraAI @AMD @jundotkim Could you implement this in oMLX? It would mean insane performance gains, since one of the largest bottlenecks for Macs is context prefill.
0 replies · 0 reposts · 1 like · 627 views

Zyphra @ZyphraAI
We present ZAYA1-8B-Diffusion-Preview, the first diffusion language model trained on @AMD. Autoregressive LLMs generate one token at a time; diffusion generates a block in parallel, speeding up inference. We show a 4.6-7.7x decoding speedup with minimal quality degradation 🧵
[media]
21 replies · 80 reposts · 643 likes · 1.1M views
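
To see where a 4.6-7.7x decoding speedup can come from, a toy step-count model helps: an autoregressive LM pays one forward pass per token, while a block-diffusion LM pays a fixed number of denoising passes per block, however wide the block is. The block sizes and step counts below are made-up illustrations, not Zyphra's settings:

```python
def autoregressive_passes(n_tokens):
    return n_tokens  # one forward pass per generated token

def diffusion_passes(n_tokens, block_size, denoise_steps):
    blocks = -(-n_tokens // block_size)  # ceiling division
    return blocks * denoise_steps        # each block denoised in parallel

n = 1024
for block_size, steps in [(32, 6), (64, 10)]:  # hypothetical settings
    speedup = autoregressive_passes(n) / diffusion_passes(n, block_size, steps)
    print(f"block={block_size}, steps={steps}: ~{speedup:.1f}x fewer passes")
# -> ~5.3x and ~6.4x, the same ballpark as the quoted 4.6-7.7x range
```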

Eddie @Edouardmazza
@ZyphraAI @AMD This would be insane if you could implement this for Mac users, open source. The biggest problem with Macs is prefill speed.
0 replies · 0 reposts · 0 likes · 648 views

Eddie @Edouardmazza
@outsource_ You should use the Unsloth version instead.
1 reply · 0 reposts · 1 like · 50 views

Eric ⚡️ Building... @outsource_
Using bench-loop.com to test new models on my studio / hardware stack. Landed on majentik/Qwen3.6-35B-A3B-TurboQuant-MLX-4bit running 60.1 tk/s 🚀🚀
[media]
2 replies · 0 reposts · 13 likes · 960 views

Eddie @Edouardmazza
@jun_song Is there any Qwen 3.6 35B in MLX format with MTP?
1 reply · 0 reposts · 0 likes · 158 views

송준 Jun Song @jun_song
Daily reminder: use MLX format for local LLMs running on Mac. It's the fastest and best format for Apple silicon.
9 replies · 1 repost · 54 likes · 3.3K views
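
For anyone who hasn't tried MLX yet, running an MLX-format model locally is a few lines with the mlx-lm package. The repo name below is a placeholder; substitute any MLX-format model, e.g. from the mlx-community org on Hugging Face:

```python
# pip install mlx-lm  (requires Apple Silicon)
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/SOME-MLX-MODEL-4bit")  # placeholder repo
print(generate(model, tokenizer, prompt="Why run LLMs locally on a Mac?",
               max_tokens=128))
```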

Eddie @Edouardmazza
Intel's comeback is the greatest tech turnaround of 2026. Apple just signed on to have Intel manufacture chips again. Stock surged 15% on the news. Q2 earnings crushed estimates. Up 114% in April alone. 15 years after Apple left them, they're back in the supply chain.
0 replies · 0 reposts · 0 likes · 62 views

Eddie @Edouardmazza
@jun_song What if it has 1GB of RAM but everything is being streamed remotely through Google's servers, like NVIDIA Reflex?
0 replies · 0 reposts · 0 likes · 20 views

stevibe @stevibe
Wondering which quant is right for you? I ran Unsloth's Qwen3.6-27B MTP across the full range: Q2 through Q8 (all _K_XL variants) on my DGX Spark.
> UD-Q2_K_XL: 23.95 tok/s, 261ms TTFT
> UD-Q3_K_XL: 22.12 tok/s, 286ms TTFT
> UD-Q4_K_XL: 19.51 tok/s, 307ms TTFT
> UD-Q5_K_XL: 17.74 tok/s, 363ms TTFT
> UD-Q6_K_XL: 16.30 tok/s, 381ms TTFT
> UD-Q8_K_XL: 12.07 tok/s, 444ms TTFT
Q2 runs nearly 2x faster than Q8, and TTFT climbs steadily with precision. Even if you're not on a DGX Spark, the relative gap between quants should hold on your setup.
12 replies · 8 reposts · 80 likes · 10.1K views
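
A sweep like this is straightforward to reproduce. Below is a minimal sketch using llama-cpp-python that measures TTFT as time to the first streamed token and tok/s over the remaining decode; the GGUF paths are placeholders, and stevibe's exact harness is not shown in the tweet:

```python
import time
from llama_cpp import Llama  # pip install llama-cpp-python

def bench(model_path, prompt="Explain KV caches briefly.", max_tokens=256):
    llm = Llama(model_path=model_path, n_gpu_layers=-1, verbose=False)
    start = time.monotonic()
    ttft, n_tokens = None, 0
    for _ in llm(prompt, max_tokens=max_tokens, stream=True):
        n_tokens += 1
        if ttft is None:
            ttft = time.monotonic() - start  # time to first token
    decode_time = time.monotonic() - start - ttft
    return n_tokens / decode_time, ttft

for path in ["UD-Q2_K_XL.gguf", "UD-Q5_K_XL.gguf", "UD-Q8_K_XL.gguf"]:  # placeholders
    tps, ttft = bench(path)
    print(f"{path}: {tps:.2f} tok/s, {ttft * 1000:.0f} ms TTFT")
```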

송준 Jun Song @jun_song
Local LLM fits for Mac RAM size (5/14):
> ~32GB: SuperGemma4-e4b-mlx
> 32~64GB: Qwen3.6-35b-mlx-6bit
> 96~128GB: Minimax-M2.7 / Deepseek-V4-Flash (JANGTQ by @dealignai)
> 256GB: Xiaomi-MiMo-V2.5
> 512GB: GLM-5.1-RAM-420GB-MLX
Follow for updates.
20 replies · 20 reposts · 273 likes · 15.6K views
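
The tiers above translate directly into a lookup table. A trivial sketch, with the thresholds interpreted loosely since the original tiers leave gaps:

```python
# Tiers copied from the tweet; upper-bound RAM in GB -> suggested model.
MODEL_BY_RAM = [
    (32,  "SuperGemma4-e4b-mlx"),
    (64,  "Qwen3.6-35b-mlx-6bit"),
    (128, "Minimax-M2.7 / Deepseek-V4-Flash"),
    (256, "Xiaomi-MiMo-V2.5"),
    (512, "GLM-5.1-RAM-420GB-MLX"),
]

def pick_model(ram_gb):
    for upper_bound, name in MODEL_BY_RAM:
        if ram_gb <= upper_bound:
            return name
    return MODEL_BY_RAM[-1][1]  # largest model for 512GB+ machines

print(pick_model(96))  # -> Minimax-M2.7 / Deepseek-V4-Flash
```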

Cheng @zcbenz
We have reached a milestone in MLX: all tests are now passing on the CUDA backend.
[media]
17 replies · 87 reposts · 576 likes · 172.7K views

Eddie @Edouardmazza
@ClaudeDevs I already gave up on Claude. I’m just running a local model on my Mac Studio and it’s more than enough for me.
0 replies · 0 reposts · 0 likes · 27 views

ClaudeDevs @ClaudeDevs
Claude Code weekly limits are increasing 50%, now through July 13. Live now for all Pro, Max, Team, and seat-based Enterprise users.
[media]
1.3K replies · 2K reposts · 21.8K likes · 2.5M views

Eddie @Edouardmazza
Cerebras just upsized its IPO to $4.8B at $150-160/share. Meanwhile S&P 500 fell 0.6% yesterday as Brent crude surges past $107 and Trump says the Iran ceasefire is on "life support." AI chip hype vs. real-world oil risk. The tension everyone's ignoring.
0 replies · 0 reposts · 0 likes · 88 views

Eddie @Edouardmazza
@jundotkim Do you think we will see you combine both in the future?
0 replies · 0 reposts · 0 likes · 4 views

Jun Kim @jundotkim
Appreciate the shoutout and the thoughtful comparison. oMLX started from a simple frustration: waiting too long every time my coding agent shifted context. The SSD KV cache was built to fix that. Nice to see MTPLX taking a different angle with speculative decoding. Different problems, both worth solving.
Quoting Alex @AlexJonesax:

Two open-source MLX inference servers worth knowing about if you run LLMs on Mac:

MTPLX (@youssofal): Uses a model's own MTP heads for speculative decoding. No draft model needed. ~63 tok/s on Qwen3.6-27B (M5 Max). Mathematically exact sampling too; not just greedy prefix matching.

oMLX (@jundot): Tiered KV cache that persists to SSD across restarts. Huge for coding agents where you're sending the same codebase context repeatedly. Also serves LLMs, VLMs, embeddings, rerankers, and audio simultaneously.

They're solving different problems; MTPLX maximizes tok/s, oMLX maximizes workflow efficiency. Both have OpenAI- and Anthropic-compatible APIs, both work with Claude Code/OpenCode/Cursor out of the box. Running both depending on the task. But both worth checking out.
3 replies · 1 repost · 11 likes · 717 views
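
Since both servers expose OpenAI-compatible APIs, pointing a stock client at either one is the same few lines. The port and model name below are placeholders; check each server's docs for the real values:

```python
# Works against any OpenAI-compatible local server (MTPLX, oMLX, ...).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="local-model",  # placeholder; servers differ in how they name models
    messages=[{"role": "user", "content": "Summarize the tradeoffs of SSD KV caching."}],
)
print(resp.choices[0].message.content)
```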

Eddie @Edouardmazza
@TeksEdge Anything like this for Mac MLX and for Qwen 3.6 35B?
0 replies · 0 reposts · 1 like · 487 views

David Hendrickson @TeksEdge
🤯 Unsloth released the fastest Qwen3.6-27B MTP GGUF I've tested. Time to upgrade. Compared to the previous GGUF, Q4/Q6 XL versions are 👀 ~55% faster! On a single RTX 5090:
✅ 114 tok/s — UD-IQ2_M (MTP)
✅ 93 tok/s — UD-Q4_K_XL (MTP)
✅ 75 tok/s — UD-Q6_K_XL (MTP)
💨 Fastest MTP quant is 3.3x faster than the old Q8_0 baseline (35 tps)
262K context + tool calling. All on one 5090.
* compiled from the MTP PR branch ('am17an:mtp-clean', build b9117-ebe4fca4b)
[media]
31 replies · 49 reposts · 513 likes · 40K views
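
As a quick sanity check, the multiples follow directly from the figures quoted in the tweet:

```python
# Ratios computed from the tok/s numbers quoted above.
baseline_q8_tps = 35.0  # old Q8_0 baseline
for quant, tps in [("UD-IQ2_M", 114.0), ("UD-Q4_K_XL", 93.0), ("UD-Q6_K_XL", 75.0)]:
    print(f"{quant}: {tps / baseline_q8_tps:.1f}x vs old Q8_0")
# UD-IQ2_M -> 3.3x, matching the "3.3x faster" headline claim.
```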