rw ./

6.7K posts


@gradientintern

push past limits | real niu for @Gradient_HQ | part time troll | full time janitor | professional edging specialist

edge to edge · Joined December 2024
588 Following · 2.3K Followers
Pinned Tweet
rw ./ @gradientintern ·
⬜️⬜️⬜️ ⬜️ ⬜️ ⬜️ ⬜️⬜️ ⬜️⬜️⬜️ ⬜️⬜️⬜️ ⬜️⬜️⬜️ ⬜️ ⬜️ ⬜️ ⬜️ ⬜️ ⬜️ ⬜️⬜️⬜️ ⬜️⬜️⬜️ ⬜️ ⬜️ ⬜️⬜️⬜️ ◻️◻️ ./ training mode on… @Gradient_HQ
24 · 13 · 85 · 9.8K
rw ./ @gradientintern ·
@flowerlang20 MiMo V2 is definitely a great series of models to work with! 🤠
0 · 0 · 1 · 32
rw ./ @gradientintern ·
MiMo V2 Omni is Xiaomi’s answer for the real world, driven by multimodal understanding and action. MiMo V2 Omni performs exceptionally against current counterparts, leading in:
- MMAU-Pro (Audio): MiMo 69.4 (highest), Gemini Pro 65.0, Flash 67.0
- FutureOmni (Forecast): MiMo 66.7 (highest), Gemini Pro 60.3, Flash 62.9
On par in:
- BigBench Audio (Speech): MiMo 94.0, Gemini Pro 91.2, Flash 99.2 (highest)
- Video-MME (Video QA): MiMo 85.3, Gemini Pro 88.4 (highest), Flash 76.7
- MMMU-Pro (Multimodal): MiMo 76.8, Claude 73.9, GPT 79.5, Gemini Pro 81.0, Flash 81.2 (highest)
- CharXiv RQ (Charts): MiMo 80.1, Claude 77.4, GPT 82.1 (highest), Gemini Pro 81.4, Flash 80.3
Another great alternative from Xiaomi for scaling agents in the real world at a more cost-efficient level, available on Commonstack
rw ./ tweet media
rw ./ @gradientintern

MiMo V2 Pro and MiMo V2 Omni are Xiaomi’s latest flagships for agentic and multimodal intelligence. MiMo V2 Pro is Xiaomi’s trillion-parameter model with 42B active parameters (almost 3x MiMo V2 Flash).
MiMo V2 Pro vs Leading Models
General agentic capabilities:
> PinchBench (avg.): 81.0%; nearly ties Claude Opus 4.6 (81.5)
> ClawEval: 61.5%; competitive with Claude models (66.3)
> GDPVal-AA: 1426; strong complex tool use
> DeepSearch QA-F1: 86.7%; competitive with Sonnet 4.6 (89.2) and Opus 4.6 (91.3)
> t2-bench: 96.8%; extremely close to the 98–99 leaders
Coding agentic capabilities:
> SWE-bench Verified: 78.0%; very competitive with Claude Opus 4.6 (80.8%), GPT-5.2 (80.0%), and Sonnet 4.6 (79.6%)
> SWE-bench Multilingual: 71.7%
> Terminal-Bench 2.0: 57.1%; strong production coding performance
Xiaomi has also significantly improved on hallucination: V2 Pro scores 30% vs V2 Flash’s 48% on AA Omniscience. This model sits between GLM5 & Kimi K2.5 on the Artificial Analysis Intelligence Index. Incredible work by @XiaomiMiMo 🔥

2 · 9 · 21 · 425
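To make the hallucination claim above concrete, here's a quick sketch of the relative improvement implied by the two AA Omniscience numbers quoted in the tweet (the rates are from the tweet; the "relative reduction" framing is my own gloss):

```python
# Hallucination rates quoted in the tweet above (AA Omniscience; lower is better).
v2_flash = 0.48
v2_pro = 0.30

# Relative reduction in hallucination rate going from Flash to Pro.
relative_drop = (v2_flash - v2_pro) / v2_flash
print(f"{relative_drop:.1%}")  # 37.5%
```

So "30% vs 48%" works out to roughly a third fewer hallucinations, not just an 18-point drop.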
Junyang Lin
Junyang Lin @JustinLin610 ·
this is a huge broccoli 🥦
Junyang Lin tweet media
38 · 6 · 462 · 27.9K
rw ./ @gradientintern ·
MiMo V2 Pro and MiMo V2 Omni are Xiaomi’s latest flagships for agentic and multimodal intelligence. MiMo V2 Pro is Xiaomi’s trillion-parameter model with 42B active parameters (almost 3x MiMo V2 Flash).
MiMo V2 Pro vs Leading Models
General agentic capabilities:
> PinchBench (avg.): 81.0%; nearly ties Claude Opus 4.6 (81.5)
> ClawEval: 61.5%; competitive with Claude models (66.3)
> GDPVal-AA: 1426; strong complex tool use
> DeepSearch QA-F1: 86.7%; competitive with Sonnet 4.6 (89.2) and Opus 4.6 (91.3)
> t2-bench: 96.8%; extremely close to the 98–99 leaders
Coding agentic capabilities:
> SWE-bench Verified: 78.0%; very competitive with Claude Opus 4.6 (80.8%), GPT-5.2 (80.0%), and Sonnet 4.6 (79.6%)
> SWE-bench Multilingual: 71.7%
> Terminal-Bench 2.0: 57.1%; strong production coding performance
Xiaomi has also significantly improved on hallucination: V2 Pro scores 30% vs V2 Flash’s 48% on AA Omniscience. This model sits between GLM5 & Kimi K2.5 on the Artificial Analysis Intelligence Index. Incredible work by @XiaomiMiMo 🔥
rw ./ tweet media
1 · 3 · 20 · 694
Gradient
Gradient @Gradient_HQ ·
Dobby is a free elf now. Open models, open orchestration, open compute. The agentic RL stack that used to live inside walled gardens just showed up on hardware you can order and frameworks you can fork. No masters needed.
Andrej Karpathy @karpathy

Thank you Jensen and NVIDIA! She’s a real beauty! I was told I’d be getting a secret gift, with a hint that it requires 20 amps. (So I knew it had to be good). She’ll make for a beautiful, spacious home for my Dobby the House Elf claw, among lots of other tinkering, thank you!!

27 · 26 · 203 · 12.9K
rw ./ @gradientintern ·
MiniMax M2.7 just released, and it “deeply participated in its own evolution.” This is the first model that helped build itself, running its own optimization loops and RL training.
M2.7 vs Leading Models
Strong coding:
> SWE-Bench Pro: 56.2%; beats Gemini 3.1 Pro (54.2%); on par with Claude Sonnet 4.6 (57.2%), Opus 4.6 (57.3%), GPT 5.4 (57.7%)
> Multi-SWE-Bench: 52.7% (leading)
Production:
> VIBE-Pro: 55.6%; nearly ties Sonnet 4.6 (56.1%) and Opus 4.6 (55.6%)
Strong agentic capabilities:
> MM-ClawBench (agent/tool use): 62.7%; competitive with Sonnet 4.6 (64.2%) and Opus 4.6 (75.4%)
Also significant improvements in ML! This is an incredible release.
rw ./ tweet media
2 · 5 · 33 · 1.2K
rw ./ @gradientintern ·
Both of OpenAI’s newest models are supported on day one via @commonstack_ai. GPT 5.4 mini is refined for agentic tasks, coding, and speed; GPT 5.4 nano is optimal for lightweight tasks. Both are available now for playground testing or integrated usage.
rw ./ tweet media
rw ./ @gradientintern

OpenAI releases GPT 5.4 mini & GPT 5.4 nano, targeted at agentic and coding tasks. GPT 5.4 nano at xhigh outperforms GPT 5.4 at low. Very competitive pricing.
GPT 5.4 mini:
> 400k context window
> $0.75/1M input & $4.50/1M output
GPT 5.4 nano:
> $0.20/1M input & $1.25/1M output
They’re serious about focusing their efforts on taking this coding market 👾

3 · 6 · 34 · 1.3K
MiniMax (official)
MiniMax (official) @MiniMax_AI ·
Introducing MiniMax-M2.7, our first model which deeply participated in its own evolution, with an 88% win rate vs M2.5.
- Production-Ready SWE: with SOTA performance on SWE-Pro (56.22%) and Terminal-Bench 2 (57.0%), M2.7 reduced intervention-to-recovery time for online incidents to 3 minutes on certain occasions.
- Advanced Agentic Abilities: trained for Agent Teams and a tool-search tool, with 97% skill adherence across 40+ complex skills. M2.7 is on par with Sonnet 4.6 in OpenClaw.
- Professional Workspace: SOTA in professional knowledge; supports multi-turn, high-fidelity Office file editing.
MiniMax Agent: agent.minimax.io
API: platform.minimax.io
Token Plan: platform.minimax.io/subscribe/toke…
MiniMax (official) tweet media
192 · 404 · 3.3K · 1.7M
OpenAI
OpenAI @OpenAI ·
GPT-5.4 mini is available today in ChatGPT, Codex, and the API. Optimized for coding, computer use, multimodal understanding, and subagents. And it’s 2x faster than GPT-5 mini. openai.com/index/introduc…
OpenAI tweet media
539 · 682 · 6.2K · 1.5M
rw ./ @gradientintern ·
OpenAI releases GPT 5.4 mini & GPT 5.4 nano, targeted at agentic and coding tasks. GPT 5.4 nano at xhigh outperforms GPT 5.4 at low. Very competitive pricing.
GPT 5.4 mini:
> 400k context window
> $0.75/1M input & $4.50/1M output
GPT 5.4 nano:
> $0.20/1M input & $1.25/1M output
They’re serious about focusing their efforts on taking this coding market 👾
rw ./ tweet media
OpenAI @OpenAI

GPT-5.4 mini is available today in ChatGPT, Codex, and the API. Optimized for coding, computer use, multimodal understanding, and subagents. And it’s 2x faster than GPT-5 mini. openai.com/index/introduc…

3 · 3 · 24 · 1.3K
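For a rough sense of what the per-1M-token rates quoted above mean per request, here's a quick cost sketch. The prices are from the tweet; the workload sizes (50k tokens in, 5k out, for a hypothetical agentic coding turn) are made-up examples:

```python
# Per-1M-token prices (USD) quoted in the tweet above.
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the quoted per-1M-token rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical agentic coding turn: 50k tokens in, 5k tokens out.
mini = request_cost("gpt-5.4-mini", 50_000, 5_000)
nano = request_cost("gpt-5.4-nano", 50_000, 5_000)
print(f"mini: ${mini:.4f}  nano: ${nano:.4f}")
```

At those rates the example turn costs about six cents on mini and under two on nano, which is where the "very competitive" framing comes from.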
rw ./ @gradientintern ·
@OpenAI 400k context window, $0.75 per 1M input tokens and $4.50 per 1M output tokens. Very competitive.
rw ./ tweet media
0 · 0 · 8 · 331