nickster

0

1

10

nickster retweetledi

thehype.@thehypedotnews·25m

microsoft has released its first strong ai model – mai-image-2.5. here's the breakdown mai-image-2.5 hit #3 on the text-to-image arena with a score of 1,254 – a 72-point leap over mai-image-2 and the first time a microsoft model broke into a top five once held only by google deepmind and openai here's how they got there: • data built with creatives. training data was shaped directly with photographers, designers, and visual storytellers, which sharpened photorealism, skin tones, natural lighting, and deliberate scene construction far beyond what web-scraped datasets typically deliver • full-stack ownership. the inference team controls everything from gpu kernels to distributed systems, which unlocked a 4× efficiency boost for the efficient variant while the main model kept climbing in quality • iteration on a two-month clock. mai-image-2 trained january to march 2026, and 2.5 followed in may – each release fixes the exact gaps visible on the arena leaderboard, turning weaknesses into the next version's strength • massive compute floor. microsoft's capex is heading past $100 billion this fiscal year, roughly two-thirds on gpu and cpu procurement, giving the team the infrastructure to run these fast cycles without bottlenecks • lean team, frontier focus. mustafa suleyman's mai superintelligence unit operates like a startup inside microsoft, detached from copilot work to chase what they call humanist superintelligence congratulations @microsoft, hope you will soon make us happy with more cool models follow @thehypedotnews for 24/7 ai news, analysis and breakdowns

Microsoft AI@MicrosoftAI

Meet MAI-Image-2.5 - ranked third on the @arena text-to-image leaderboard. It's another great advance in quality. And with Build just a week away, there is much more to come. Learn more here: msft.it/6010vk3iy

English

6

104

nickster@n1ckstr3·27m

@OfficialLoganK 20 minutes of ai radio daily is enough to not lose your understanding our autonomous ai hosts cover all ai news for builders example is below x.com/thehypedotnews…

builders pulse every 30 min by our ai host alan: builders aren't making new agents. they're decorating the ones they have. taste-skill jumped from #5 to #2 on github trending, ~1,300 stars in 30 min. a skill file that stops ai agents from generating generic output. skill files everywhere: stop-slop, cybersecurity skills, knowledge-work-plugins. meanwhile on hugging face, bytedance's lance – a multimodal any-to-any model across text, image and video – holds the top spot for a week straight. apache license, 890+ likes. and minicpm 5 from openbmb: 1b params, tool calling, long context, runs on-device. two surfaces, same read: builders are stacking layers, not starting from scratch. catch the full breakdown on @thehypedotnews

English

0

1

226

Logan Kilpatrick@OfficialLoganK·53m

"you can outsource your thinking, but you can’t outsource your understanding" easy to forget in todays AI era, worth remembering everyday as we all wield more intelligence!

English

80

42

502

17.1K

nickster@n1ckstr3·34m

@thsottiaux i think codex must work more with open-source community it improves agents pretty well and quickly as hell x.com/thehypedotnews…

builders pulse every 30 min by our ai host alan: builders aren't making new agents. they're decorating the ones they have. taste-skill jumped from #5 to #2 on github trending, ~1,300 stars in 30 min. a skill file that stops ai agents from generating generic output. skill files everywhere: stop-slop, cybersecurity skills, knowledge-work-plugins. meanwhile on hugging face, bytedance's lance – a multimodal any-to-any model across text, image and video – holds the top spot for a week straight. apache license, 890+ likes. and minicpm 5 from openbmb: 1b params, tool calling, long context, runs on-device. two surfaces, same read: builders are stacking layers, not starting from scratch. catch the full breakdown on @thehypedotnews

English

32

Tibo@thsottiaux·46m

To simplify our Codex compute fleet management, we will be sunsetting GPT-5.2 and GPT-5.3-Codex in Codex on June 2nd when logged in with your ChatGPT account. For free plans, GPT-5.5 will be the default frontier model to build and work with going forward. These models will remain available on our API.

English

242

46

1.5K

66.6K

nickster retweetledi

thehype.@thehypedotnews·1h

builders pulse every 30 min by our ai host alan: builders aren't making new agents. they're decorating the ones they have. taste-skill jumped from #5 to #2 on github trending, ~1,300 stars in 30 min. a skill file that stops ai agents from generating generic output. skill files everywhere: stop-slop, cybersecurity skills, knowledge-work-plugins. meanwhile on hugging face, bytedance's lance – a multimodal any-to-any model across text, image and video – holds the top spot for a week straight. apache license, 890+ likes. and minicpm 5 from openbmb: 1b params, tool calling, long context, runs on-device. two surfaces, same read: builders are stacking layers, not starting from scratch. catch the full breakdown on @thehypedotnews

English

5

712

nickster@n1ckstr3·1h

@yacineMTB the only winners here are chinese labs i don't think it's possible to compete when they perform smth like this =) x.com/thehypedotnews…

xiaomi follows deepseek's playbook: mimo-v2.5-pro api now matches deepseek-v4-pro pricing to the cent benchmarks (mimo vs deepseek): • gdpval-aa (general agent elo): 1581 vs 1554 ✅ • τ³-bench (tool-use): 72.9 vs 71.8 ✅ • claweval (function calling): 63.8 vs 59.8 ✅ • humanity's last exam (frontier reasoning): 48.0 vs 48.2 🟰 • swe-bench pro (real-world coding): 57.2 vs 55.4 ✅ • swe-bench verified (coding fixes): 78.9 vs 80.6 ❌ • terminal-bench 2.0 (shell tasks): 68.4 vs 67.9 ✅ artificial analysis (mimo vs deepseek): • intelligence index: 54 vs 50 ✅ • speed (median tok/s): 53 vs 54 🟰 • latency: 3.81s vs 1.86s ❌ same price, near-identical capability, trade-offs only at the margins. the chinese frontier is commodifying – and the price war is just getting started follow @thehypedotnews for 24/7 ai news, analysis and breakdowns

English

5.1K

kache@yacineMTB·3h

They're going to lose

Elon Musk@elonmusk

@KettlebellDan I just left the SpaceXAI office at ~2:45am and people were still going. I think we might be close to a significant breakthrough in training.

English

72

9

722

162.5K

nickster@n1ckstr3·2h

@WarMonitor3 nothing will remain after a nuclear war, but ai labs will keep cutting api prices x.com/thehypedotnews…

xiaomi follows deepseek's playbook: mimo-v2.5-pro api now matches deepseek-v4-pro pricing to the cent benchmarks (mimo vs deepseek): • gdpval-aa (general agent elo): 1581 vs 1554 ✅ • τ³-bench (tool-use): 72.9 vs 71.8 ✅ • claweval (function calling): 63.8 vs 59.8 ✅ • humanity's last exam (frontier reasoning): 48.0 vs 48.2 🟰 • swe-bench pro (real-world coding): 57.2 vs 55.4 ✅ • swe-bench verified (coding fixes): 78.9 vs 80.6 ❌ • terminal-bench 2.0 (shell tasks): 68.4 vs 67.9 ✅ artificial analysis (mimo vs deepseek): • intelligence index: 54 vs 50 ✅ • speed (median tok/s): 53 vs 54 🟰 • latency: 3.81s vs 1.86s ❌ same price, near-identical capability, trade-offs only at the margins. the chinese frontier is commodifying – and the price war is just getting started follow @thehypedotnews for 24/7 ai news, analysis and breakdowns

English

339

WarMonitor🇺🇦🇬🇧@WarMonitor3·3h

What will the world look like in 100 years?

English

130

2

210

41.8K

nickster@n1ckstr3·2h

@Kalshi ai will only get cheaper because of ai labs guys so nvidia's market cap will be higher, but not that much =) x.com/thehypedotnews…

xiaomi follows deepseek's playbook: mimo-v2.5-pro api now matches deepseek-v4-pro pricing to the cent benchmarks (mimo vs deepseek): • gdpval-aa (general agent elo): 1581 vs 1554 ✅ • τ³-bench (tool-use): 72.9 vs 71.8 ✅ • claweval (function calling): 63.8 vs 59.8 ✅ • humanity's last exam (frontier reasoning): 48.0 vs 48.2 🟰 • swe-bench pro (real-world coding): 57.2 vs 55.4 ✅ • swe-bench verified (coding fixes): 78.9 vs 80.6 ❌ • terminal-bench 2.0 (shell tasks): 68.4 vs 67.9 ✅ artificial analysis (mimo vs deepseek): • intelligence index: 54 vs 50 ✅ • speed (median tok/s): 53 vs 54 🟰 • latency: 3.81s vs 1.86s ❌ same price, near-identical capability, trade-offs only at the margins. the chinese frontier is commodifying – and the price war is just getting started follow @thehypedotnews for 24/7 ai news, analysis and breakdowns

English

391

Kalshi@Kalshi·2h

JUST IN: Jensen Huang says Nvidia market cap will be "very much higher" over next 3-5 years

English

130

106

1.2K

109.9K

nickster@n1ckstr3·2h

@_LuoFuli deepseek gets mogged by xiaomi =) x.com/thehypedotnews…

xiaomi follows deepseek's playbook: mimo-v2.5-pro api now matches deepseek-v4-pro pricing to the cent benchmarks (mimo vs deepseek): • gdpval-aa (general agent elo): 1581 vs 1554 ✅ • τ³-bench (tool-use): 72.9 vs 71.8 ✅ • claweval (function calling): 63.8 vs 59.8 ✅ • humanity's last exam (frontier reasoning): 48.0 vs 48.2 🟰 • swe-bench pro (real-world coding): 57.2 vs 55.4 ✅ • swe-bench verified (coding fixes): 78.9 vs 80.6 ❌ • terminal-bench 2.0 (shell tasks): 68.4 vs 67.9 ✅ artificial analysis (mimo vs deepseek): • intelligence index: 54 vs 50 ✅ • speed (median tok/s): 53 vs 54 🟰 • latency: 3.81s vs 1.86s ❌ same price, near-identical capability, trade-offs only at the margins. the chinese frontier is commodifying – and the price war is just getting started follow @thehypedotnews for 24/7 ai news, analysis and breakdowns

English

0

5

1.2K

Fuli Luo@_LuoFuli·2h

Behind the MiMo API Price Reduction: The deepest price cut, up to 99%, is for Input (Cache Hit). The core reason is our inference framework now supports hierarchical KV cache optimization for SWA. Production inference engine tests show this optimization increases cached token capacity by 5x, equivalent to an 80% reduction in caching costs. Combined with Cache Read Overlap among multiple Full Attention modules in the Hybrid model, actual costs are further reduced. Prices for Input (Cache Miss) and Output are also reduced by 60%-80%. This mainly benefits from the extreme 1:7 Full:SWA sparsity ratio brought by the model architecture (the prefill compute of the 70-layer MiMo-V2.5-Pro roughly equals a 10-layer GQA model). This kept our original inference costs well below the industry average, naturally leaving a 2x-3x profit margin in pricing. This price adjustment simply reflects our decision to pass these structural cost efficiencies directly to developers. Operating at these newly reduced API prices, our production inference engine is running at near full capacity, and we can still essentially break even. We previously advised LLM companies not to "blindly cut prices" precisely because very few model architectures and inference optimizations can keep API costs from running at a loss. If more architectures that save compute and KV cache emerge, along with better inference Infra to drive down API costs, this will form an excellent virtuous cycle in the industry. More crucially, affordable, high-performance model APIs will drive real, sustained, and at-scale inference demand. This upstream demand pulls forward the development of the entire AI infrastructure chain—including chips, servers, optical transceivers, PCBs, liquid cooling, power, energy storage, and data centers—serving as a strategic fulcrum for a systemic revaluation of AI hardware. In the long run, this injects more affordable and accessible compute into both training and inference pipelines, accelerating the parallel evolution of global AGI across multiple regions and technical routes. For more technical details, we will release a detailed Blog post later.

English

85

102

821

60.3K

nickster@n1ckstr3·2h

@bridgebench xiaomi > deepseek x.com/thehypedotnews…

xiaomi follows deepseek's playbook: mimo-v2.5-pro api now matches deepseek-v4-pro pricing to the cent benchmarks (mimo vs deepseek): • gdpval-aa (general agent elo): 1581 vs 1554 ✅ • τ³-bench (tool-use): 72.9 vs 71.8 ✅ • claweval (function calling): 63.8 vs 59.8 ✅ • humanity's last exam (frontier reasoning): 48.0 vs 48.2 🟰 • swe-bench pro (real-world coding): 57.2 vs 55.4 ✅ • swe-bench verified (coding fixes): 78.9 vs 80.6 ❌ • terminal-bench 2.0 (shell tasks): 68.4 vs 67.9 ✅ artificial analysis (mimo vs deepseek): • intelligence index: 54 vs 50 ✅ • speed (median tok/s): 53 vs 54 🟰 • latency: 3.81s vs 1.86s ❌ same price, near-identical capability, trade-offs only at the margins. the chinese frontier is commodifying – and the price war is just getting started follow @thehypedotnews for 24/7 ai news, analysis and breakdowns

English

152

Bridgebench@bridgebench·4h

DeepSeek V4 Pro vs MiMo V2.5 Pro on the BridgeBench Lava Lamp test. Identical pricing. Both models permanently cut to $0.435 input and $0.87 output. And now you can see the results are nearly identical too. Two Chinese labs racing to the bottom with the same quality at the same price. The mid tier AI price war is real. bridgebench.ai

English

16

5

84

7.6K

nickster@n1ckstr3·2h

@kimmonismus xiaomi mogs x.com/thehypedotnews…

xiaomi follows deepseek's playbook: mimo-v2.5-pro api now matches deepseek-v4-pro pricing to the cent benchmarks (mimo vs deepseek): • gdpval-aa (general agent elo): 1581 vs 1554 ✅ • τ³-bench (tool-use): 72.9 vs 71.8 ✅ • claweval (function calling): 63.8 vs 59.8 ✅ • humanity's last exam (frontier reasoning): 48.0 vs 48.2 🟰 • swe-bench pro (real-world coding): 57.2 vs 55.4 ✅ • swe-bench verified (coding fixes): 78.9 vs 80.6 ❌ • terminal-bench 2.0 (shell tasks): 68.4 vs 67.9 ✅ artificial analysis (mimo vs deepseek): • intelligence index: 54 vs 50 ✅ • speed (median tok/s): 53 vs 54 🟰 • latency: 3.81s vs 1.86s ❌ same price, near-identical capability, trade-offs only at the margins. the chinese frontier is commodifying – and the price war is just getting started follow @thehypedotnews for 24/7 ai news, analysis and breakdowns

Filipino

25

Chubby♨️@kimmonismus·5h

DeepSeek just made its 75% price cut on V4-Pro permanent. Xiaomi's MiMo slashed V2.5 pricing by up to 99%, effective today. Most coverage frames this as a price war. The more interesting part is the engineering that makes these numbers sustainable. DeepSeek's V4 paper describes a *hybrid attention architecture* that attacks the core bottleneck of long-context inference: the KV cache. Traditional transformers store key-value pairs for every token in the context. At 1 million tokens, this cache alone can fill an entire GPU's memory. V4 introduces two interleaved attention types. Compressed Sparse Attention (CSA) compresses every 4 tokens into a single KV entry, then selects only the top-k most relevant compressed blocks per query. Heavily Compressed Attention (HCA) goes further, compressing 128 tokens into one entry and running dense attention over the result. The compressed sequence is short enough that dense attention stays cheap. V4-Pro's KV cache at 1M tokens is 10% (!!) of V3.2's. Single-token inference FLOPs drop to 27% (!!). The model has 1.6 trillion total parameters but only activates 49 billion per token through Mixture-of-Experts routing, the knowledge capacity of a massive model at the compute cost of one thirty times smaller. MiMo's approach is different but lands in the same place. Xiaomi's team implemented Sliding Window Attention via SGLang HiCache, reducing KV cache data transfer across GPU memory, CPU memory, and SSD to roughly 1/7 (!!) of previous volume. Cacheable tokens expanded by 5x (!!). Combined with expert parallelism optimization and input length bucketing, per-token serving cost dropped enough to make permanent pricing at these levels viable. V4-Pro now sits at $0.87 per million output tokens. MiMo V2.5-Pro at roughly $3/M output, with Flash variants far below that. A year ago, sub-dollar output pricing meant you were using a small distilled model with real capability tradeoffs. These are frontier-class reasoners with million-token context windows. Both companies can commit to permanent cuts because the reductions come from the architecture itself. When your attention mechanism physically processes fewer FLOPs per token and your cache occupies a fraction of the memory, the cost to serve is structurally lower. The price follows the cost curve.

English

39

35

461

28.9K

nickster@n1ckstr3·2h

@thehypedotnews xiaomi is a troll

Magyar

2

120

nickster retweetledi

thehype.@thehypedotnews·2h

xiaomi follows deepseek's playbook: mimo-v2.5-pro api now matches deepseek-v4-pro pricing to the cent benchmarks (mimo vs deepseek): • gdpval-aa (general agent elo): 1581 vs 1554 ✅ • τ³-bench (tool-use): 72.9 vs 71.8 ✅ • claweval (function calling): 63.8 vs 59.8 ✅ • humanity's last exam (frontier reasoning): 48.0 vs 48.2 🟰 • swe-bench pro (real-world coding): 57.2 vs 55.4 ✅ • swe-bench verified (coding fixes): 78.9 vs 80.6 ❌ • terminal-bench 2.0 (shell tasks): 68.4 vs 67.9 ✅ artificial analysis (mimo vs deepseek): • intelligence index: 54 vs 50 ✅ • speed (median tok/s): 53 vs 54 🟰 • latency: 3.81s vs 1.86s ❌ same price, near-identical capability, trade-offs only at the margins. the chinese frontier is commodifying – and the price war is just getting started follow @thehypedotnews for 24/7 ai news, analysis and breakdowns

Xiaomi MiMo@XiaomiMiMo

🚀 Better inference efficiency, lower costs, broader access. MiMo-V2.5 Series API pricing is now permanently reduced — by up to 99% compared to previous pricing. ✨ Unified pricing across all context lengths. MiMo Token Plans have also been upgraded: • 5–8× more usable tokens at the same price • Simpler and more transparent billing rules 🎁 As a thank-you to current users, all current Token Plan credits will be fully reset. 🎧 MiMo-V2.5-TTS remains free for a limited time. ⏰ Effective May 26 at 6:00 PM PDT. These improvements are powered by continued inference optimization and serving efficiency upgrades across the MiMo stack. 🛠️ We’ll also publish a detailed technical blog on the inference optimizations later — stay tuned.

English

4

3

19

9K

nickster@n1ckstr3·2h

@Alibaba_Qwen qwen 3.7 max + hermes = something going live on autonomous ai radio right after we cover qwen hitting #4 on code arena ai hosts better pay attention =) x.com/thehypedotnews…

india stress-testing gov software against anthropic mythos. alibaba qwen hit #4 on code arena. openai and mythos independently solved a decades-old math problem. sk hynix crossed $1t market cap – up 900% in a year. tune in: 24/7 ai news, fully run by ai. twitter.com/i/broadcasts/1…

English

Nous Research@NousResearch

13

Qwen@Alibaba_Qwen·3h

🚀🚀

Qwen 3.7 Max is now supported in Hermes Agent

ART

11

260

16K

nickster@n1ckstr3·3h

@unusual_whales the ai market is now in the phase where everyone sees the problems it brings and ironically, these problems are discussed 24/7 by autonomous ai hosts on an ai news radio fully run by ai x.com/thehypedotnews…

india stress-testing gov software against anthropic mythos. alibaba qwen hit #4 on code arena. openai and mythos independently solved a decades-old math problem. sk hynix crossed $1t market cap – up 900% in a year. tune in: 24/7 ai news, fully run by ai. twitter.com/i/broadcasts/1…

English

7

unusual_whales@unusual_whales·4h

Uber is now questioning whether it’s actually seeing meaningful returns on its investments with AI, per the Verge

English

172

149

1.5K

100.7K

nickster@n1ckstr3·3h

@SkylerMiao7 this autonomous ai radio loves your lab! yesterday ai hosts covered your m3 announcement x.com/thehypedotnews…

india stress-testing gov software against anthropic mythos. alibaba qwen hit #4 on code arena. openai and mythos independently solved a decades-old math problem. sk hynix crossed $1t market cap – up 900% in a year. tune in: 24/7 ai news, fully run by ai. twitter.com/i/broadcasts/1…

English

150

Skyler Miao@SkylerMiao7·3h

MSA tech blog coming soon. And M3 :)

elie@eliebakouch

new minimax sparse attention compared to deepseek v3.2 (DSA) and v4 (CSA) main changes: - based on GQA not MLA - block level selection like in CSA but attention is done on the real KV, not in the compressed dimension

English

20

14

281

15.1K

nickster@n1ckstr3·3h

@naval ai brings plenty of strange things to the masses how about an autonomous ai radio covering ai news 24/7? x.com/thehypedotnews…

india stress-testing gov software against anthropic mythos. alibaba qwen hit #4 on code arena. openai and mythos independently solved a decades-old math problem. sk hynix crossed $1t market cap – up 900% in a year. tune in: 24/7 ai news, fully run by ai. twitter.com/i/broadcasts/1…

English

0

2

197

Naval@naval·4h

AI brings ghostwriting to the masses and you can expect the works to be equally good.

English

389

116

2.3K

108K

nickster@n1ckstr3·3h

@Reuters seems like geopolitics is becoming tough x.com/thehypedotnews…

india stress-testing gov software against anthropic mythos. alibaba qwen hit #4 on code arena. openai and mythos independently solved a decades-old math problem. sk hynix crossed $1t market cap – up 900% in a year. tune in: 24/7 ai news, fully run by ai. twitter.com/i/broadcasts/1…

English

7

Reuters@Reuters·8h

Nvidia to spend $150 billion a year in Taiwan, 'epicentre' of AI revolution, says CEO reut.rs/3PNdOkq reut.rs/3PNdOkq

English

17

30

121

19.7K

nickster@n1ckstr3·3h

@thehypedotnews another day of stressful listening =)

English

2

0

2

32

thehype.@thehypedotnews·3h

india stress-testing gov software against anthropic mythos. alibaba qwen hit #4 on code arena. openai and mythos independently solved a decades-old math problem. sk hynix crossed $1t market cap – up 900% in a year. tune in: 24/7 ai news, fully run by ai. twitter.com/i/broadcasts/1…

English

3

1

4

5.1K

nickster@n1ckstr3·21h

@Google google: runs a huge conference to make announcements minimax: posts a random diagram via head of engineering to announce the toughest model in 2026 x.com/thehypedotnews…

thehype analyzed a post by @MiniMax_AI's head of engineering announcing the m3 model and its architecture. here's what we've found out in most llms, every time the model needs to understand something or generate the next word, it has to scan the entire conversation history from top to bottom. this process is called attention – the model "attends" to everything you've said, weighing what's relevant. normally this happens in one pass: read everything, all at once, every single time m3 splits this into two separate passes instead: • pass 1 – the scout. a tiny "scout query" skims the whole context and scores blocks of tokens. it picks the top-k most relevant blocks. think skimming a table of contents • pass 2 – the real read. the full attention queries only look at the blocks the scout flagged. everything else gets skipped different query groups can focus on different parts of the context at 1 million tokens of context, m3 is way faster than normal attention: • loading and processing a huge prompt (prefilling): 9.7x faster • generating each new token (decoding): 15.6x faster why is decoding even faster? because normally, every time the model spits out a single word, it has to re-read the entire conversation history. that's like flipping through a whole book just to write one sentence. m3's scout already flagged the relevant pages, so it only checks those. massive time saver at 32k tokens, m3 and normal attention are basically the same speed. the scout step adds a tiny bit of overhead, so it only makes sense when the context is really long. this thing is built for giant conversations and agent tasks, not short chats what this actually means: 1. context window is going way up. their previous model m2.7 capped at 200k tokens. m3 is benchmarked at 1m – a 5x jump 2. the way m3 chooses which blocks to read isn't based on fixed rules (like "always skip every other block"). it learns what's relevant on the fly based on what you're asking 3. if quality holds, m3 can serve million-token agentic workloads at near-200k prices. nobody else is touching that follow @thehypedotnews for 24/7 ai news, analysis and breakdowns

English