Milan





People are not lying when they say Qwen3.5-27B is incredibly capable.

1. Bubble size = total params - world knowledge, languages, skills
2. X axis = active params - raw intelligence per token
3. Y axis = tokens/s - speed of prefill and generation (decode)

GLM-5 | 744B params | 40B active
Kimi-K2.5 | 1T params | 32B active
Qwen3.5-27B | 27B active params
Qwen3.5-Plus | 397B params | 17B active
MiniMax-M2.7 | 229B params | 10B active

MoEs can store much more world knowledge and breadth of information. A Mixture-of-Experts model can be scaled up to 1T params, so you can give it 20 trillion tokens or more of training data and it learns more. But at runtime, only a small portion of those parameters is activated.

Taking MiniMax-M2.7 as an example: only 10B params are active at a time, so while you use it you get speed and per-token intelligence closer to Nemotron-8B. The difference is that MiniMax-M2.7 knows much more, and thus performs better.
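For anyone wondering what "only a small portion gets activated" means mechanically, here is a minimal, hypothetical sketch of top-k expert routing in PyTorch. The layer sizes, expert count, and top_k are made-up illustration values, not the config of any model named above; it just shows why total params (everything stored) and active params (what runs per token) can differ so much.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy Mixture-of-Experts layer: many experts stored, few active per token."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=64, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                                # x: (tokens, d_model)
        logits = self.router(x)                          # score every expert per token
        weights, idx = logits.topk(self.top_k, dim=-1)   # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e                 # tokens routed to expert e in this slot
                if mask.any():
                    w = weights[mask, slot].unsqueeze(-1)
                    out[mask] += w * self.experts[e](x[mask])
        return out

# Total vs. active parameter count for this toy layer
layer = MoELayer()
tokens = torch.randn(4, 512)                             # 4 tokens of width d_model
y = layer(tokens)

total = sum(p.numel() for p in layer.parameters())
active_per_token = (layer.top_k * sum(p.numel() for p in layer.experts[0].parameters())
                    + layer.router.weight.numel())
print(f"total params: {total:,}  active per token: {active_per_token:,}")
```

All 64 experts sit in memory (the "bubble size"), but each token only passes through 2 of them plus the router, which is why decode speed tracks active params rather than total params.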











How can a 3B-parameter model reach this quality? 👀



















