Luigi Maselli

4.7K posts

@grigi0

techno-geek, webdev, hackerpreneur.

Earth · Joined November 2009
270 Following · 499 Followers
antirez (@antirez)
Look at this. Also opencode uses freaking 11k tokens of system prompt. Even at decent pre-fill of ~130 t/s it means waiting 84 seconds to start a session. What's the point? :D The pi agent is a lot saner here. Moreover, one could say, let's cache on disk very long common KV cache chunks, no? Hash it with all the parameters and put a sensible TTL if not used. But also: only cache it if you see it repeated N times across different sessions.
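The caching idea in the post above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how opencode, pi, or any inference server actually works: `DiskKVCache`, `cache_key`, `prefill_seconds`, and the `.kv` file suffix are hypothetical names, the KV state is treated as an opaque byte blob, and the bookkeeping (sightings, last-use times) lives in memory only. It shows the three pieces mentioned: hashing the prefix with all parameters, admitting an entry only after N distinct sessions have presented it, and expiring entries that go unused past a TTL.

```python
import hashlib
import json
import os
import tempfile
import time


def prefill_seconds(prompt_tokens, tokens_per_second):
    """Time spent on prefill before the first generated token."""
    return prompt_tokens / tokens_per_second


def cache_key(token_ids, params):
    """Hash the prompt prefix with every parameter that affects the KV state."""
    payload = json.dumps({"tokens": list(token_ids), "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class DiskKVCache:
    """Hypothetical on-disk cache; the KV state is an opaque bytes blob."""

    def __init__(self, directory, min_sessions=3, ttl_seconds=7 * 24 * 3600, clock=time.time):
        self.directory = directory
        self.min_sessions = min_sessions  # the "N" distinct sessions before caching
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.seen = {}       # key -> set of session ids that presented this prefix
        self.last_used = {}  # key -> timestamp of the last store or hit
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.directory, key + ".kv")

    def lookup(self, key, session_id):
        """Return the cached blob or None. Every lookup counts as a sighting."""
        self.seen.setdefault(key, set()).add(session_id)
        path = self._path(key)
        if key in self.last_used and os.path.exists(path):
            self.last_used[key] = self.clock()  # a hit refreshes the TTL
            with open(path, "rb") as f:
                return f.read()
        return None

    def maybe_store(self, key, kv_blob):
        """Write the blob only once enough distinct sessions have asked for it."""
        if len(self.seen.get(key, ())) < self.min_sessions:
            return False
        with open(self._path(key), "wb") as f:
            f.write(kv_blob)
        self.last_used[key] = self.clock()
        return True

    def evict_expired(self):
        """Delete entries not used within the TTL; return how many were removed."""
        now = self.clock()
        expired = [k for k, t in self.last_used.items() if now - t > self.ttl_seconds]
        for key in expired:
            os.remove(self._path(key))
            del self.last_used[key]
        return len(expired)


if __name__ == "__main__":
    # 11k tokens at ~130 t/s, the figures quoted in the post.
    print(f"prefill wait: {prefill_seconds(11_000, 130):.1f} s")
    with tempfile.TemporaryDirectory() as tmp:
        cache = DiskKVCache(tmp, min_sessions=2)
        key = cache_key(range(11_000), {"model": "example-model", "n_ctx": 32768})
        for session in ("a", "b"):
            if cache.lookup(key, session) is None:
                cache.maybe_store(key, b"opaque-kv-state")
        print("cached after two sessions:", cache.lookup(key, "c") is not None)
```

Hashing the parameters together with the tokens matters because the same prompt produces a different KV state under a different model, quantization, or context setup. A real implementation would also need to persist the sighting counts across processes, which this sketch does not attempt.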
Luigi Maselli (@grigi0)
Tested Qwen3.6-35B-A3B-APEX-I-Compact: almost as good as Qwen3.6-35B-A3B-UD-Q4_K_M, but it takes 4 GB less. //cc @mudler_it @lu_zero_
Luigi Maselli (@grigi0)
Qwen3.6 35B A3B performs worse than Qwen3.5 in my benchmark, but that is probably related to the quantization or tool calling. //cc @UnslothAI Gemma 4 26B A4B is still better for me. #localai #llm
Luigi Maselli (@grigi0)
Are Local LLMs good enough for Vibe Coding? Gemma4-26B-A4B vs Qwen3.5-35B-A3B
Luigi Maselli (@grigi0)
nothink is the best tradeoff between accuracy and speed
Luigi Maselli (@grigi0)
Gemma 4 26B A4B is currently the best tradeoff in agentic benchmarks for GPU-poor people, especially the nothink version
Luigi Maselli (@grigi0)
Currently qwen-3.6-plus-free is the best model you can run on #opencode for quality and speed.
Luca Barbato (@lu_zero_)
@grigi0 I'm still happy with openrc, and this reminds me that our Rust implementation of mdev needs updating and a bit more testing.