Tarjei Mandt
@kernelpool
2.2K posts
Sydney, Australia · Joined August 2009
608 Following · 17.4K Followers
Tarjei Mandt reposted
N8 Programs (@N8Programs):
Recently, @awnihannun asserted that 'According to benchmarks Qwen3.5 4B is as good as GPT 4o.' This drew controversy: Is the 4B just benchmaxxed? How could a 4B be as good as GPT-4o? I tried to test this scientifically. The answer to the question is likely: yes, in most cases.
[image]
Tarjei Mandt (@kernelpool):
@awnihannun Thanks for all the great work on MLX! Good luck on what’s next!
Awni Hannun (@awnihannun):
Today is my last day at Apple. Building MLX with our amazing team and community has been an absolute pleasure. It's still early days for AI on Apple silicon. Apple makes the best consumer hardware on the planet. There's so much potential for it to be the leading platform for AI. And I'm confident MLX will continue to have a big role in that. To the future: MLX remains in the exceptionally capable hands of our team including @angeloskath, @zcbenz, @DiganiJagrit, @NasFilippova, @trebolloc (and others not on X). Follow them or @shshnkp for future updates.
[image]
Tarjei Mandt reposted
Ivan Krstić (@radian):
🔺NEW: iPhone and iPad are now the first and only generally-available devices to meet the exacting security requirements for handling classified NATO information. apple.com/newsroom/2026/…
Tarjei Mandt reposted
l33tdawg (@l33tdawg):
Over the CNY holidays, I decided to build something that imho is 'peak agentic AI' 🤣 - the world's first self-evolving CTF platform! AI agents design, validate, calibrate, and evolve security challenges autonomously. levelupctf.com Here's the full story 🧵
Tarjei Mandt (@kernelpool):
@blacktop__ Imagine being the AI and seeing the GPU getting restricted
Blacktop (@blacktop__):
Gave Claude my `ipsw` tool and my `ida-mcp-rs` and asked it what Apple adding `com.apple.developer.gpu-restricted` to `com.apple.WebKit.WebContent.EnhancedSecurity` does and here's the report it made. We are so cooked chat 🪦 gist.github.com/blacktop/f2606…
Simon (@AI_Homelab):
@ivanfioravanti @kernelpool Nice to see it checked through perplexity. Maybe we'll see more usage of DWQ quantization in MLX in the future. I think this is the first time I've actually seen a "proof" for it here on X. 👌
Ivan Fioravanti ᯅ (@ivanfioravanti):
MLX DWQ quantization works! Here's the perplexity for JoyAI-LLM_Flash, uploaded to mlx-community by @kernelpool
[image]
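Perplexity, the metric behind the comparison above, is just the exponential of the mean negative log-likelihood over the evaluated tokens. A minimal sketch with made-up per-token log-probabilities (the function is the point, the numbers are hypothetical):

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood) over a token sequence."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# Hypothetical per-token log-probabilities from a quantized model's eval run
log_probs = [-0.5, -1.2, -0.3, -0.8]
print(round(perplexity(log_probs), 3))  # → 2.014
```

Lower is better; a DWQ quant whose perplexity closely tracks the full-precision model is the kind of "proof" being discussed here.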
Tarjei Mandt (@kernelpool):
@ivanfioravanti The sparse attention is slowing down the prefill; however, it can be fixed.
Ivan Fioravanti ᯅ (@ivanfioravanti):
This is what I mean. Benchmarking 64k context on M3 Ultra:
Prompt: 63976 tokens, 45.1 tokens-per-sec
Generation: 200 tokens, 12.1 tokens-per-sec
Peak memory: 471.61 GB
Total wall time: 1492s 👀
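The figures above can be sanity-checked with back-of-the-envelope arithmetic: at those throughputs, prefill alone accounts for nearly all of the wall time, which is why the replies focus on prompt processing.

```python
# Figures taken from the benchmark above (M3 Ultra, 64k context)
prompt_tokens, prompt_tps = 63976, 45.1
gen_tokens, gen_tps = 200, 12.1

prompt_s = prompt_tokens / prompt_tps   # time spent in prefill
gen_s = gen_tokens / gen_tps            # time spent generating
print(round(prompt_s), round(gen_s), round(prompt_s + gen_s))  # → 1419 17 1435
```

That leaves roughly 57s of the reported 1492s for everything else (model load, tokenization, overhead), so a faster prefill path is where almost all of the win would come from.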
Ivan Fioravanti ᯅ (@ivanfioravanti):
GLM-5 can't be run locally on Apple Silicon. Even at 4-bit quantization it's too slow. We need more GPU power and memory bandwidth for models of this size.
Sam Collinson (@_rezin_):
@kernelpool unless it’s a kiwi model and it’s heaps of spraying as an itick victer
Tarjei Mandt (@kernelpool):
That feeling when your SOTA model suggests heap spraying as an attack vector in 2026
Tarjei Mandt (@kernelpool):
@awnihannun @digitalix Generation aside, a sparse attention kernel would also help :) Currently, prompt processing slows down quite a bit for longer contexts.
Awni Hannun (@awnihannun):
@digitalix Thanks for the results, clearly we have some work to do! Also you can use `mlx_lm.benchmark` to test tensor parallel scaling while ensuring it generates the same number of tokens for each setup. It will be slightly more accurate.
Alex Ziskind (@digitalix):
GLM-5 4-bit scaling on a cluster of M3 Ultra Mac Studios, using MLX.
[image]
Tarjei Mandt reposted
Awni Hannun (@awnihannun):
GLM-5 runs with mlx-lm on a single 512GB M3 Ultra in Q4. It's quite good in my initial testing and pretty fast as well. It generated a highly functional space invaders game using 7.1k tokens at 15.4 tok/s and 419GB memory. Thanks to @ActuallyIsaak and @kernelpool for the port.
Quoted from Z.ai (@Zai_org):
Introducing GLM-5: From Vibe Coding to Agentic Engineering. GLM-5 is built for complex systems engineering and long-horizon agentic tasks. Compared to GLM-4.5, it scales from 355B params (32B active) to 744B (40B active), with pre-training data growing from 23T to 28.5T tokens.
Try it now: chat.z.ai
Weights: huggingface.co/zai-org/GLM-5
Tech Blog: z.ai/blog/glm-5
OpenRouter (previously Pony Alpha): openrouter.ai/z-ai/glm-5
Rolling out to Coding Plan Max users: z.ai/subscribe
Ivan Fioravanti ᯅ (@ivanfioravanti):
What is a good notebook for Linux? Tired of waiting for M5 Max 🤷🏻‍♂️
Ivan Fioravanti ᯅ (@ivanfioravanti):
@RickRossTN I've used a local conversion. @kernelpool created the mlx-community version. I think it stopped uploading due to a bug that is fixed in a PR.
Ivan Fioravanti ᯅ (@ivanfioravanti):
Step-3.5-Flash in action on MLX with OpenCode on a single M3 Ultra (distributed testing in progress!) to create a snake game! 🔥 6-bit quantization. Perfect tool calling. Fast & powerful coding model!
Recommended inference settings: Temperature 1.0, Top-p 0.95, Top-k 40 🧵
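The recommended settings above compose in the usual way: scale logits by temperature, keep only the k most likely tokens, then cut to the smallest nucleus whose cumulative probability reaches p, and sample from what survives. A dependency-free sketch of that math (a toy logit list, not the MLX implementation):

```python
import math, random

def sample(logits, temperature=1.0, top_k=40, top_p=0.95):
    # Temperature-scale, then keep only the top-k (logit, index) pairs.
    scaled = sorted(((l / temperature, i) for i, l in enumerate(logits)),
                    reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    m = scaled[0][0]
    exps = [(math.exp(l - m), i) for l, i in scaled]
    total = sum(e for e, _ in exps)
    probs = [(e / total, i) for e, i in exps]
    # Nucleus (top-p): smallest prefix whose cumulative mass reaches top_p.
    kept, cum = [], 0.0
    for p, i in probs:
        kept.append((p, i))
        cum += p
        if cum >= top_p:
            break
    # Renormalise and draw one token index.
    z = sum(p for p, _ in kept)
    r = random.random() * z
    for p, i in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][1]

random.seed(0)
token = sample([2.0, 1.0, 0.5, -1.0], top_k=3, top_p=0.9)
print(token in {0, 1, 2})  # index 3 is removed by top-k, so → True
```

MLX-LM implements its samplers internally (e.g. in its sample_utils module), and CLI flag names may differ; treat this purely as a description of what the three knobs do.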
Tarjei Mandt (@kernelpool):
@ivanfioravanti Did you try a higher quant than 4bit (e.g. 6bit)? I've sometimes seen quantization affect the </think> token probability negatively.
Ivan Fioravanti ᯅ (@ivanfioravanti):
One thing about Step-3.5-Flash is certain: it thinks a lot, really a lot, to reach its conclusions and reply to the user.
Ivan Fioravanti ᯅ (@ivanfioravanti):
Adding support for model type step3p5 to MLX using Codex, MLX Skill and @RepoPrompt. Let's try! 🚀
[image]