Mateusz Mirkowski

3.8K posts


@llmdevguy

Autonomous agents, agentic engineering. Building & testing agentic systems. Exploring local LLMs.

Remote work evangelist
Joined March 2013
149 Following, 1.8K Followers
Jun Song
Jun Song@jun_song·
Fine-tuned Super-Chaton Fat:
- 500T with 1B MoE
- 1B context
- 104.2% on SWE bench
- 98.2% on cute cat bench
Alexander Knigge@AlexanderKnigge

oh my god its happening @MistralAI has officially confirmed the upcoming release of Le Chaton Fat
- 30T MoE with 256 experts
- 1M context window
- multimodal and multilingual
- outperforms Fable 5 on every benchmark

English
16
8
169
18.5K
kiyoe
kiyoe@kiyoe_333·
@llmdevguy K2.7 seems to be around Sonnet 4.6 level. GLM 5.2 (max) might be somewhere between Opus 4.6 and 4.7. That said, GLM 5.2’s token consumption is seriously brutal.
English
1
0
0
131
Mateusz Mirkowski
Mateusz Mirkowski@llmdevguy·
🔥GLM 5.2 vs Kimi K2.7. Which one is better? Will test them soon. What are your thoughts?
English
64
2
374
54.1K
ZCode
ZCode@zcode_ai·
GLM-5.2 is now fully available for GLM Coding Plan users. ZCode 3.0 is deeply optimized for GLM-5.2, bringing stronger Agent task execution, better long-context coding, and the new Goal feature for managing larger development objectives from planning to completion. Coding Plan subscribers get 150% usage quota inside ZCode. New users get 5 days free with 5M tokens per day. Download: zcode.z.ai
English
87
127
1.5K
151.3K
thepeche
thepeche@maciekkukuczka·
@llmdevguy Right, I don't see GLM 5.2 yet on OpenRouter, Zen, or Go. Anyway, once you've tested it, please post your opinion. Thanks!
Polski
1
0
0
16
Mateusz Mirkowski
Mateusz Mirkowski@llmdevguy·
@maciekkukuczka Kimi is more dependable. GLM can be damn slow at times. That said, I haven't tested the new versions yet. I recommend getting opencode go for 5 dollars and playing around with it.
Polski
1
0
1
36
thepeche
thepeche@maciekkukuczka·
@llmdevguy OK, and which of the two, GLM or Kimi, on a plan costing around a hundred?
Polski
1
0
0
12
Mateusz Mirkowski
Mateusz Mirkowski@llmdevguy·
@maciekkukuczka Unfortunately M3 doesn't quite deliver. It's a level below GLM and Kimi. Still, it's fine for simpler or personal use cases.
Polski
1
0
1
217
thepeche
thepeche@maciekkukuczka·
@llmdevguy I've been wondering the same thing... And how does MiniMax M3 compare to them? Any experience with .NET (Blazor)?
Polski
1
0
1
256
stevibe
stevibe@stevibe·
Tested DiffusionGemma 26B A4B vs Gemma4 26B A4B (BenchLocal, 4 bench packs):

Diffusion | Gemma4
> ToolCall-15: 83 | 97
> BugFind-15: 92 | 82
> DataExtract-15: 76 | 83
> ReasonMath-15: 50 | 89

Gemma4 leads overall, the ReasonMath gap is steep. But DiffusionGemma edged it out on BugFind-15, which surprised me. Diffusion text quality looks rougher right now, but it's still an experiment. Curious to see where it lands long-term.
English
7
2
69
5.1K
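For anyone who wants to check the "leads overall" reading of stevibe's numbers, here is a minimal sketch that averages the four scores quoted in the post. The unweighted mean is an assumption made here for illustration; the post does not say how BenchLocal aggregates packs, and the `summarize` helper is hypothetical.

```python
# Scores copied from the post: (DiffusionGemma, Gemma4) for each bench pack.
scores = {
    "ToolCall-15": (83, 97),
    "BugFind-15": (92, 82),
    "DataExtract-15": (76, 83),
    "ReasonMath-15": (50, 89),
}


def summarize(scores):
    """Return unweighted means per model and per-pack gaps (Gemma4 minus Diffusion).

    The unweighted mean is an assumption, not BenchLocal's documented method.
    """
    n = len(scores)
    diffusion_mean = sum(d for d, _ in scores.values()) / n
    gemma4_mean = sum(g for _, g in scores.values()) / n
    gaps = {pack: g - d for pack, (d, g) in scores.items()}
    return diffusion_mean, gemma4_mean, gaps


diffusion_mean, gemma4_mean, gaps = summarize(scores)
print(f"DiffusionGemma mean: {diffusion_mean:.2f}")
print(f"Gemma4 mean:         {gemma4_mean:.2f}")
for pack, gap in gaps.items():
    # A positive gap means Gemma4 scored higher on that pack.
    print(f"{pack}: {gap:+d}")
```

Under that assumption the means come out to 75.25 and 87.75. The only negative gap is BugFind-15 (-10), and ReasonMath-15 is the widest (+39), which matches the post's summary.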
redmonkey
redmonkey@redmonkeyAI·
@llmdevguy Add minimax M3 to the test as well please
English
1
0
0
1.1K
Saad
Saad@eSaadster·
@llmdevguy glm-5.2 > kimi-k2.7 > glm-5.1
Türkçe
2
0
30
4.1K
Mateusz Mirkowski
Mateusz Mirkowski@llmdevguy·
@arunoda True. I use 5.5 mostly. Kimi and glm are good enough in most cases but not in very complex tasks.
English
0
0
4
2.8K
Arunoda Susiripala
Arunoda Susiripala@arunoda·
@llmdevguy I tried some really hard tasks yesterday with both. Still they are not Opus or GPT 5.5 level. Working with complex tasks involving backend, client & UI stuff they are not there yet. But for surgical tasks, it’s good. Both are better than their previous versions.
English
1
0
9
4.7K
Daniel Petro
Daniel Petro@DanielPetroAI·
@llmdevguy I've only tried kimi so far for code review and it worked pretty well!
English
1
0
5
3.6K
Steve💙🇨🇦
I hate that I am starting to like AI generated music better than other modern music. There are some telltale signs music is AI generated, but it's subtle. My sister is big into making AI music also, so I've been forced to listen to quite a bunch this year. Make it on a 3090.
English
8
0
12
555
stevibe
stevibe@stevibe·
My first reaction: How is that possible? Running DiffusionGemma 26B A4B NVFP4 on my DGX Spark at 161.9 tok/s!
English
35
22
521
40K
Unsloth AI
Unsloth AI@UnslothAI·
Google releases DiffusionGemma.✨ The new 26B-A4B diffusion text model runs locally on 18GB RAM. It supports high-speed text generation, thinking, image, video and 256K context. Run and train via Unsloth Studio. GGUF: huggingface.co/unsloth/diffus… Guide: unsloth.ai/docs/models/di…
Google Gemma@googlegemma

Meet DiffusionGemma! An experimental open model that explores a fast approach to text generation, released under an Apache 2.0 license. Moving beyond sequential, token-by-token processes to generate entire blocks of text simultaneously. Here’s what’s new with DiffusionGemma: 👇

English
65
247
1.9K
323K