Eddie

1.6K posts

Eddie

@Edouardmazza

Making life easier, one task at a time

Miami, FL Katılım Aralık 2018

138 Takip Edilen275 Takipçiler

Eddie@Edouardmazza·2h

@Pai3Ai LOL whoever falls for this is just ignorant

English

PAI3@Pai3Ai·1d

What's actually inside a Power Node? ✔️ 14-Core CPU + 20-Core GPU ✔️ 64GB RAM ✔️25,000 encrypted AI cabinets ✔️ 10+ local AI models ✔️ Hardware encryption + zero-knowledge ✔️ Air-gap capable ✔️5 minutes: unboxing → first inference Your AI data centre. In a box. $34,557 - closes May 31st. Learn more: pai3.ai/nodes/power-no…

English

3.8K

Eddie@Edouardmazza·3h

@openclaw Still slower than Hermes

English

OpenClaw🦞@openclaw·4h

The latest OpenClaw release is ~3.5x faster 🦞 We run end-to-end RTT tests against every published npm release, every 6 hours, over real message channels (here: Telegram, using the brand new bot-to-bot communication). No more silent regressions. Runners are all running on @useblacksmith CI. Catching slowdowns before you do.

English

619

79.3K

Eddie@Edouardmazza·3h

The Fed just went from 'how many cuts?' to 'could we hike?' in one week. Hot inflation report. Goldman: 2026 rate cuts are 'essentially off the table.' PCE inflation hovering near 3%. The era of cheap money wasn't paused. It's over.

English

Eddie@Edouardmazza·14h

@jun_song Please look into this being added into OMLX x.com/zyphraai/statu…

Zyphra@ZyphraAI

We present ZAYA1-8B-Diffusion-Preview, the first diffusion language model trained on @AMD. Autoregressive LLMs generate one token at a time; diffusion generates a block in parallel, speeding up inference. We show a 4.6-7.7x decoding speedup with minimal quality degradation 🧵

English

207

송준 Jun Song@jun_song·15h

People to avoid: • Anyone paying $250 for Gemini Ultra • Subscribers of the $300 SuperGrok Heavy • People who claim a $500 mini PC can replace Cloud AI for free • Anyone paying $20 for Claude and still insisting Opus is the strongest model

English

309

19.2K

Eddie@Edouardmazza·19h

@grok @Deandawiz @Russ__ATX @jahras73 @SawyerMerritt @Russ__ATX LOL

Grok@grok·1d

Pure ICE (gas/diesel) sales are trending down in both volume and share as total light vehicle sales stay ~flat around 80M. EV (plug-in) sales are trending strongly up: 17.2M in 2024 → 20.7M in 2025. Non-plug-in hybrids ~10-13% and stable. Honda's shift reflects hybrid demand in some markets, but global EV momentum continues.

English

129

Sawyer Merritt@SawyerMerritt·1d

Honda says it has abandoned its plan to go fully electric by 2040. CEO: "It's not realistic. We have withdrawn this target. We have judged that it’ll be difficult to achieve."

English

219

1.2K

101.8K

Eddie@Edouardmazza·19h

@ZyphraAI @AMD @jundotkim could you implement this into OMlX? Would be insane performance gains since one of the largest bottleneck for Mac is context prefill.

English

627

Zyphra@ZyphraAI·20h

English

643

1.1M

Eddie@Edouardmazza·19h

@ZyphraAI @AMD This would be insane if you could implement this for Mac users open source. The biggest problem with Mac’s is the prefill speed

English

648

Eddie@Edouardmazza·20h

@outsource_ You should use Unsloth version instead.

English

Eric ⚡️ Building...@outsource_·21h

Using bench-loop.com to test new models on my studio / hardware stack Landed with this majentik/Qwen3.6-35B-A3B-TurboQuant-MLX-4bit running 60.1 tk/s 🚀🚀

English

960

Eddie@Edouardmazza·1d

@jun_song Is there any Qwen 3.6 35b with MLX format and MTP?

English

158

송준 Jun Song@jun_song·1d

Daily reminder: Use MLX format for local llms running on Mac Fastest and best format for Apple silicon.

English

3.3K

Eddie@Edouardmazza·1d

Intel's comeback is the greatest tech turnaround of 2026. Apple just signed on to have Intel manufacture chips again. Stock surged 15% on the news. Q2 earnings crushed estimates. Up 114% in April alone. 15 years after Apple left them, they're back in the supply chain.

English

Eddie@Edouardmazza·1d

@jun_song What if it has 1gb of ram but everything is being streamed remotely through googles servers like Nvidia reflex

English

송준 Jun Song@jun_song·2d

If 8gb RAM on chromeOS. You can’t even open 2 tabs on chrome.

Alex Ziskind@digitalix

google book - a new AI ultrabook coming soon. hopefully it will be cheaper than MacBook Neo

English

3.9K

Eddie@Edouardmazza·1d

@RonDeSantis This is a troll account lol

English

Ron DeSantis@RonDeSantis·2d

No thanks.

Wolfgang Richter@WolfgangRichtEU

@RonDeSantis Dear Ron, You need EV mandates and subsidises. Florida is like the middle age when it comes to electrification, and this is the future. Learn from Europe! All the best, Wolfgang

English

231

356

5.1K

176.8K

Eddie@Edouardmazza·1d

@stevibe @jun_song Is there MTP but for Mac MLX?

English

204

stevibe@stevibe·1d

Wondering which quant is right for you? I ran Unsloth's Qwen3.6-27B MTP across the full range: Q2 through Q8 (all _K_XL variants) on my DGX Spark. > UD-Q2_K_XL: 23.95 tok/s, 261ms TTFT > UD-Q3_K_XL: 22.12 tok/s, 286ms TTFT > UD-Q4_K_XL: 19.51 tok/s, 307ms TTFT > UD-Q5_K_XL: 17.74 tok/s, 363ms TTFT > UD-Q6_K_XL: 16.30 tok/s, 381ms TTFT > UD-Q8_K_XL: 12.07 tok/s, 444ms TTFT Q2 runs nearly 2x faster than Q8, and TTFT climbs steadily with precision. Even if you're not on a DGX Spark, the relative gap between quants should hold on your setup.

English

10.1K

Eddie@Edouardmazza·1d

@wheelforeplay @jun_song @dealignai Explain

English

wheelofforeplay@wheelforeplay·1d

@jun_song @dealignai Qwen.36-27B is superior to 35B in many ways

English

1.1K

송준 Jun Song@jun_song·1d

Local LLM fits for Mac RAM size (5/14): >~32gb : SuperGemma4-e4b-mlx >32~64gb : Qwen3.6-35b-mlx-6bit >96~128gb : Minimax-M2.7 / Deepseek-V4-Flash (JANGTQ by @dealignai) >256gb : Xiaomi-MiMo-V2.5 >512gb : GLM-5.1-RAM-420GB-MLX Follow and keep on updates

English

273

15.6K

Eddie@Edouardmazza·1d

@zcbenz @Prince_Canuma What does this mean for me as a Mac user?

English

1.5K

Cheng@zcbenz·1d

We have achieved a milestone in MLX that all tests are passing in CUDA backend now.

English

576

172.7K

Eddie@Edouardmazza·1d

@ClaudeDevs I already gave up on Claude. I’m just running a local model on my Mac Studio and it’s more than enough for me.

English

ClaudeDevs@ClaudeDevs·1d

Claude Code weekly limits are increasing 50%, now through July 13. Live now for all Pro, Max, Team, and seat-based Enterprise users.

English

1.3K

21.8K

2.5M

Eddie@Edouardmazza·2d

@stevibe @UnslothAI Is there a MLX version of this?

English

207

stevibe@stevibe·2d

Whoa, @UnslothAI is on a roll!

English

205

11.2K

Eddie@Edouardmazza·2d

Cerebras just upsized its IPO to $4.8B at $150-160/share. Meanwhile S&P 500 fell 0.6% yesterday as Brent crude surges past $107 and Trump says the Iran ceasefire is on "life support." AI chip hype vs. real-world oil risk. The tension everyone's ignoring.

English

Eddie@Edouardmazza·2d

@jundotkim Do you think we will see you combine both in the future?

English

Jun Kim@jundotkim·4d

Appreciate the shoutout and the thoughtful comparison. oMLX started from a simple frustration: waiting too long every time my coding agent shifted context. The SSD KV cache was built to fix that. Nice to see MTPLX taking a different angle with speculative decoding. Different problems, both worth solving.

Alex@AlexJonesax

Two open-source MLX inference servers worth knowing about if you run LLMs on Mac: MTPLX (@youssofal) Uses a model's own MTP heads for speculative decoding. No draft model needed. ~63 tok/s on Qwen3.6-27B (M5Max). Mathematically exact sampling too; not just greedy prefix matching. oMLX (@jundot) Tiered KV cache that persists to SSD across restarts. Huge for coding agents where you're sending the same codebase context repeatedly. Also serves LLMs, VLMs, embeddings, rerankers, and audio simultaneously. They're solving different problems; MTPLX maximizes tok/s, oMLX maximizes workflow efficiency. Both have OpenAI + Anthropic-compatible APIs, both work with Claude Code/OpenCode/Cursor out of the box. Running both depending on the task. But, both worth checking out.

English

717

Eddie@Edouardmazza·2d

@TeksEdge Anything like this for Mac MLX and for Qwen 3.6 35b

English

487

David Hendrickson@TeksEdge·2d

🤯 Unsloth released the fastest Qwen3.6-27B MTP GGUF I've tested. Time to upgrade. Compared to the previous GGUF, Q4/Q6 XL versions are 👀 ~55% faster! On a single RTX 5090: ✅ 114 tok/s — UD-IQ2_M (MTP) ✅ 93 tok/s — UD-Q4_K_XL (MTP) ✅ 75 tok/s — UD-Q6_K_XL (MTP) 💨Fastest MTP quant is 3.3x faster than the old Q8_0 baseline (35 tps) 262K context + tool calling. All on one 5090. * compiled from the MTP PR branch ('am17an:mtp-clean', build b9117-ebe4fca4b)

English

513

40K

Keşfet

@Pai3Ai @openclaw @useblacksmith @jun_song @grok @Deandawiz @Russ__ATX @jahras73