

MiniMax (official)

@MiniMax_AI
Agent: @MiniMaxAgent Token Plan: https://t.co/BDCycxepZw API: https://t.co/fHRdSV7BwZ Community: https://t.co/uhxxfLgkLU



I didn't touch TouchDesigner myself. Hermes agent learned it from scratch and built this:
→ navigated my desktop with computer use
→ figured out how to connect to TouchDesigner
→ read my reference images
→ iterated on the art with me in a self-learning loop
→ then saved what it learned as a reusable skill for the next image
All powered by @MiniMax_AI M3 × Hermes Desktop Agent @NousResearch
Here's a full breakdown 📽️


The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…


Made some improvements on the decode path for MiniMax M3 by @MiniMax_AI on MLX-VLM.
Faster decode, slightly lighter footprint.
Thanks to @ivanfioravanti for the PR 🚀
PR: github.com/Blaizzy/mlx-vl…

MiniMax M3 support added to mlx-vlm with MSA implementation! 🚀 Tested on M3 Ultra 512GB, running at 24 tps with peak memory ~240GB. Now working on optimizing performance and adding a ton of tests 💪 Model is here: huggingface.co/mlx-community/… PR is here: github.com/Blaizzy/mlx-vl…

MiniMax M3 can now be run locally!🔥 MiniMax-M3 is a new 428B (23B active) open model with 1M context that performs on par with Gemini 3.1 Pro. Run Dynamic 2-bit GGUF on 138GB RAM/VRAM or 3-bit on 165GB. GGUF: huggingface.co/unsloth/MiniMa… Guide: unsloth.ai/docs/models/mi…
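As a rough sanity check on the memory figures in the post above, the quoted footprints can be turned into an average number of bits per parameter. This is a back-of-the-envelope sketch only: it assumes 1 GB = 10^9 bytes and treats the whole quoted RAM/VRAM figure as weights, which overstates the true weight size because some of that memory goes to the KV cache and runtime overhead. The helper name `effective_bits_per_weight` is hypothetical.

```python
def effective_bits_per_weight(footprint_gb: float, params_billion: float) -> float:
    """Upper bound on average bits per parameter.

    ASSUMPTIONS: 1 GB = 1e9 bytes, and the entire footprint is model weights
    (in practice part of it is KV cache and runtime overhead).
    """
    total_bits = footprint_gb * 1e9 * 8
    total_params = params_billion * 1e9
    return total_bits / total_params


# Figures quoted in the post: 428B total parameters,
# 138 GB for the 2-bit build and 165 GB for the 3-bit build.
two_bit = effective_bits_per_weight(138, 428)
three_bit = effective_bits_per_weight(165, 428)
print(f"2-bit build: about {two_bit:.2f} bits per weight")
print(f"3-bit build: about {three_bit:.2f} bits per weight")
```

Both results land somewhat above the nominal bit width, which is what one would expect if some tensors are kept at higher precision and part of the budget is not weights at all.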

MiniMax-M3 from @MiniMax_AI is now available on Together AI. It’s an open-weight native multimodal model with 1M context, MiniMax Sparse Attention, and thinking / non-thinking modes. Together AI is MiniMax’s preferred cloud partner, with inference optimizations delivering up to 125% higher throughput across concurrency levels.

🎉 Congrats to @MiniMax_AI on releasing MiniMax M3! Frontier coding and agentic capabilities, native image and video input, computer use, and a 1M-token context window, all in a single open model.

At the heart of M3 is MSA, a new sparse attention architecture: instead of attending densely over the full KV cache, each query scores 128-token KV blocks and runs attention only over the top blocks. That is what makes 1M-token context practical to serve.

M3 runs in vLLM with day-0 support, verified on NVIDIA and AMD hardware:
✨ MSA sparse attention with dedicated prefill and decode kernels
✨ 1M-token context serving with prefix caching and chunked prefill
✨ BF16 and MXFP8 checkpoints, with MoE backends for both Hopper and Blackwell
✨ Native multimodal input (image + video)
✨ Tool calling, reasoning parsing, and thinking-mode control for agent workloads

Day-0 support like this is a true team effort. Grateful to the teams at @MiniMax_AI, @NVIDIAAI, @AIatAMD, and @inferact, and to the vLLM community for making it happen. 🙏

Deep dive into the implementation, kernel work, and deployment recipes: 🔗 vllm.ai/blog/2026-06-1…
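The block-selection idea described in the post above (score fixed-size KV blocks per query, then attend only over the top blocks) can be illustrated with a small pure-Python sketch. This is not the MSA implementation or the vLLM kernel: the function name `block_sparse_attention`, the `top_k` parameter, and the choice to score a block by the query's dot product with the block's mean key are all assumptions made for illustration. The post only states that blocks are 128 tokens and that attention runs over the top-scoring ones.

```python
import math


def block_sparse_attention(q, keys, values, block_size=128, top_k=2):
    """Attend one query over only the top_k highest-scoring KV blocks.

    q: list[float]; keys, values: list[list[float]], one row per cached token.
    Returns (output_vector, selected_block_indices).
    """
    n, dim = len(keys), len(q)
    scale = 1.0 / math.sqrt(dim)

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Split the KV cache into contiguous blocks of block_size tokens.
    blocks = [range(s, min(s + block_size, n)) for s in range(0, n, block_size)]

    # ASSUMPTION: a block's score is the query dotted with the block's mean key.
    def block_score(block):
        mean_key = [sum(keys[i][d] for i in block) / len(block) for d in range(dim)]
        return dot(q, mean_key)

    # Keep only the top_k blocks, then restore their original order.
    ranked = sorted(range(len(blocks)), key=lambda b: block_score(blocks[b]), reverse=True)
    selected = sorted(ranked[:top_k])
    token_ids = [i for b in selected for i in blocks[b]]

    # Ordinary softmax attention, restricted to tokens in the selected blocks.
    logits = [dot(q, keys[i]) * scale for i in token_ids]
    max_logit = max(logits)
    weights = [math.exp(l - max_logit) for l in logits]
    total = sum(weights)
    out = [
        sum(w * values[i][d] for w, i in zip(weights, token_ids)) / total
        for d in range(len(values[0]))
    ]
    return out, selected


# Tiny usage example: 8 cached tokens, 4 blocks of 2, keep the single best block.
keys = [[0.0, 1.0], [0.0, 1.0], [5.0, 0.0], [5.0, 0.0],
        [-1.0, 0.0], [-1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
values = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0], [3.0], [3.0]]
out, selected = block_sparse_attention([1.0, 0.0], keys, values, block_size=2, top_k=1)
print(selected, out)
```

The point of the sketch is the cost profile: the per-token attention work scales with `top_k * block_size` rather than with the full cache length, while the block-scoring pass touches only one summary vector per block.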

@MiniMax_AI 400B? I doubt you have the capability to train models with 800B or even 1T parameters.