Mateusz Mirkowski

3.8K posts


@llmdevguy

Autonomous agents, agentic engineering. Building & testing agentic systems. Exploring local LLMs.

Remote work evangelist. Joined March 2013
150 Following, 1.8K Followers
MiniMax (official) @MiniMax_AI
Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities
- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
- Natively Multimodal from Step Zero
API: platform.minimax.io
Token Plan: platform.minimax.io/subscribe/toke…
🚀New! MiniMax Code: code.minimax.io
Weights & Tech Report in ~10 Days
272
575
4.1K
584.9K
Mateusz Mirkowski @llmdevguy
Playing with the new TV (the middle one). My wife loves the new bedroom design for the next two weeks - 10/10. My wife's happiness - 10/10. My happiness - 10/10. Only one score is true. Guess which one. 😂 BTW, the new soundbar is coming on Monday. 🙈
1
0
4
535
Shawn @tigerwhoTT
@llmdevguy Bro, I don't like Anthropic; the way they treat individuals sucks, but you have no choice if you're not a coder. For coding, just throw Claude out; Codex is good enough.
1
0
1
373
Mateusz Mirkowski @llmdevguy
Did Opus 4.8 beat GPT-5.5? Please tell me because I don't use anything from Anthropic.
116
0
365
105.2K
Ignacio Giagante @nachitogiagante
@llmdevguy I've literally abandoned Anthropic. Every time I've tried to go back to Opus, it's a disaster. Then I have to use code to fix all the findings.
1
0
1
304
Sanarsh @sanarsh11
@llmdevguy Opus 4.8 beat GPT-5.5... at burning through my tokens faster.
1
0
41
3.7K
Kiril Videlov @krlvi
@llmdevguy I prefer GPT-5.5 over Opus 4.7. However, after some testing over the past 2-3 hours, I find that 4.8 outperforms 5.5 in most of my tasks.
7
0
59
9.4K
Loktar 🇺🇸 @loktar00
My daily Hermes tasks have been running on DeepSeek all month. Figured I'd check if I needed to top up... maybe in another 6 months 😂 For me it's ticking all the boxes: cheap, fast, good.
9
1
46
3K
Youssof Altoukhi @Youssofal_
After spending time with Qwen 3.6 27B in Cursor, I've come to realise the constraint on local models isn't intelligence but the harnesses. Local model harnesses are TERRIBLE. Pi, open code, etc. are genuinely bad. As a community, we need to do better than this.
185
21
756
81K