Max Headroom

20.3K posts

@CosmicMonad

Project Mayhem: Operation Unity. Russian from the Ukrainian SSR (УССР). Vincit omnia veritas.

Joined April 2024

431 Following · 456 Followers

Pinned Tweet

Max Headroom@CosmicMonad·1 May

Max Headroom@CosmicMonad·5h

@loktar00 MoEs are always more harebrained than dense models. This is what 27b looks like when researching with a similar prompt:

Loktar 🇺🇸@loktar00·6h

Hmmm little concerned with 3.6 35b... it calls tools well but sucks at research and relies on its internal knowledge instead of searching hard. 3 out of 5 were good.. but these 2 were terrible. Asked it to research what happened in the local llm world today.
> Here's what happened in the local LLM / open-weight space for April 16, 2026:
2. Meta Llama 4 Scout & Maverick - First native multimodal MoE
3. OpenAI Codex open-weight + Unsloth Studio launch: OpenAI released Codex open-weight models (20B and 120B variants) available on Ollama under "tools thinking cloud"

Max Headroom@CosmicMonad·5h

@LottoLabs What automation harness/framework are you using here?

Lotto@LottoLabs·7h

Let’s see if qwen 35b can not fumble this. Big ask, but mainly a long task to check tool call reliability rather than quality.

Max Headroom@CosmicMonad·7h

Installed sglang 0.5.10rc0 with the mamba memory leak fix. Testing qwen3.5-27b-fp8 now, seems snappier and so far memory is doing better!

Max Headroom@CosmicMonad·10h

I dunno about all them milk crates, but at least a couple everyone should be able to pull off. This is about to get ridiculous.

Ahmad@TheAhmadOsman

Whatʼs stopping you from becoming a chad like Gilfoyle and building your own servers instead of relying on the cloud? The PATH to becoming a GREAT engineer starts this way (the path to saving a lot of $$$ also starts this way)

Max Headroom@CosmicMonad·10h

@PierceZhang34 @NousResearch It can be made to support it, kinda. Hermes itself can do it :)

omegaAI@PierceZhang34·11h

@NousResearch Do you have plan to support searxng?

Nous Research@NousResearch·15h

Tool Gateway is now live in Nous Portal. No separate accounts, no API key juggling. All you need is one subscription, and everything works. A paid Nous Portal subscription now includes access to 300+ models and a growing set of third-party tools. Launching with: → Web scraping → Browser automation → Image generation → Cloud terminal backend → Text-to-speech

Max Headroom@CosmicMonad·10h

@NousResearch Will there be an easy setup mode that avoids cloud everything, and sets up things locally, in docker containers? If anyone is interested, I have a docker-compose.yml that sets this up locally, so far w/o image gen and STT, that's in the works.

Max Headroom@CosmicMonad·12h

@LenSeaside @LottoLabs So I did some digging instead.

Len Seaside@LenSeaside·13h

@CosmicMonad @LottoLabs I was trying to let you know that there were known tool calling problems with Gemma 4 until the release of a new Jinja a few days ago. You seem to have come here for an argument...

Lotto@LottoLabs·1d

This seems like it might be the best local model of the Gemma family. Maybe better than qwen 27b? huggingface.co/Jiunsong/Super…

Max Headroom@CosmicMonad·12h

@LenSeaside @LottoLabs Do you have benchmarks after the Jinja fix to show it can reliably call tools? I haven't seen anything that isn't from Google itself.

Max Headroom@CosmicMonad·14h

@ioannadenisova4 A little pill of "etodrugin" (этодругин, a "that's-different" pill) for you: everyone here speaks their own language, not a single one :P

Рыжая 3,14сечка@ioannadenisova4·4d

By the way, gentlemen Babylonians, a reminder that all the scriptures said that before the coming of the Antichrist and the end of all living things, all people would once again begin to speak a single language.

Max Headroom@CosmicMonad·15h

@theisraelguys According to one interpretation of a couple of religions out of tens of thousands. Quit using your faiths to justify violence against one another.

The Israel Guys@theisraelguys·23h

This might sound controversial.... but God is with Israel

Max Headroom@CosmicMonad·15h

@loktar00 @SIGKITTEN 🤣😂😅

Loktar 🇺🇸@loktar00·19h

@SIGKITTEN 🤣

SIGKITTEN@SIGKITTEN·20h

omg they nerfed opus 4.7

Max Headroom@CosmicMonad·15h

@sudoingX 46tk/s, 1700 prefill. Qwen3.5-27b-fp8. 3090x2, 9950x, x870e, 64gb@6000.

Sudo su@sudoingX·18h

if you own any gpu and you're running local models drop your tok/s, quant, flags, and gpu below. nvidia, amd, laptop, desktop, doesn't matter. every config you share saves someone else 3 hours of head scratching. i'll amplify the best ones and add them to the community benchmark sheet. this is how we build the local ai knowledge base, together

Sudo su@sudoingX·18h

180 tok/s generation on a 4090 with qwen 3.6. if you're on a 4090 and not running this model yet you're leaving performance on the table. 3B active params at that speed is insane for agentic coding. thanks for the data @ErdalToprak, adding this to the comparison sheet

Erdal@ErdalToprak

- model: Qwen3.6-35B-A3B-UD-IQ4_XS.gguf
- GPU: RTX 4090
- CUDA, f16 KV, flash attention on
- n_gpu_layers=999, threads=8, batch=256, ubatch=256
- Prompt-only, 512 tokens: about 4995 tok/s
- Generation-only, 128 tokens: about 180 tok/s
- Mixed, 4096 prompt + 128 gen: about 2700 tok/s effective combined throughput
- 512,0: 4976.8 to 4994.8 tok/s
- 0,128: 179.36 to 179.95 tok/s
- 4096,128: 2700.06 tok/s
x.com/ErdalToprak/st…
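
The "effective combined throughput" figure in the numbers above can be roughly sanity-checked from the two standalone rates. The sketch below is only an estimate: it assumes the mixed run is a prefill phase followed by a generation phase, each running at its standalone rate, which the post does not state. The function name `combined_throughput` is hypothetical.

```python
def combined_throughput(prompt_tokens: int, gen_tokens: int,
                        prefill_rate: float, gen_rate: float) -> float:
    """Estimate tokens/second over a prefill phase followed by a generation phase.

    Assumption (not from the post): the phases run back to back and each
    keeps its standalone rate.
    """
    total_seconds = prompt_tokens / prefill_rate + gen_tokens / gen_rate
    return (prompt_tokens + gen_tokens) / total_seconds


# Figures quoted in the post: ~4995 tok/s prompt-only, ~180 tok/s generation-only.
estimate = combined_throughput(4096, 128, prefill_rate=4995.0, gen_rate=180.0)
print(f"estimated combined throughput: {estimate:.0f} tok/s")
```

Under these assumptions the estimate comes out somewhat above the reported 2700.06 tok/s. One possible reason, not confirmed by the post, is that prefill over a 4096-token prompt runs slower than the 512-token prompt-only figure.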

Max Headroom@CosmicMonad·15h

@lucaa_wav @mull3r_1 It's only incredible because of a massive dehumanization and deception campaign waged by various NGOs against Russians. Literally hundreds of billions were poured into conditioning people to think Russians are not human. Barely makes sense in the modern world.

Luca@lucaa_wav·17h

@mull3r_1 Incredible how the most sensible comments are from the Russians

Fábio 🇾🇪@mull3r_1·18h

With all this drama, I found out that Russians are cool

Max Headroom@CosmicMonad·16h

@GabeZZOZZ What an evil duplicitous ass.

Gabriel@GabeZZOZZ·1d

Right after Zelensky announced that all military-age Ukrainian men must return to Ukraine for mobilization 🤣

Max Headroom@CosmicMonad·17h

@droidbuilds diff is 8

DROID@droidbuilds·21h

most people get this wrong what’s the difference between 100 MB/s and 100 Mb/s?
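
The "8" in the reply above is the number of bits in a byte: MB/s counts megabytes per second, Mb/s counts megabits per second. A minimal sketch of the conversion follows; the helper names are made up for illustration.

```python
BITS_PER_BYTE = 8  # the factor of 8 between MB/s and Mb/s


def megabytes_to_megabits(mbyte_per_s: float) -> float:
    """Convert a rate in MB/s (megabytes per second) to Mb/s (megabits per second)."""
    return mbyte_per_s * BITS_PER_BYTE


def megabits_to_megabytes(mbit_per_s: float) -> float:
    """Convert a rate in Mb/s (megabits per second) to MB/s (megabytes per second)."""
    return mbit_per_s / BITS_PER_BYTE


print(megabytes_to_megabits(100))  # 100 MB/s expressed in Mb/s -> 800
print(megabits_to_megabytes(100))  # 100 Mb/s expressed in MB/s -> 12.5
```

So 100 MB/s is eight times faster than 100 Mb/s: the first is 800 Mb/s, the second only 12.5 MB/s.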

Max Headroom@CosmicMonad·17h

@LenSeaside @LottoLabs I am. You don't understand how tool calling works, are you sure you want to keep adding to this discussion?

Len Seaside@LenSeaside·18h

@CosmicMonad @LottoLabs I assumed you would be trying to get the best out of it.

Max Headroom@CosmicMonad·20h

@x0ptimal @bnjmn_marie I haven't really used the MoE qwen3.5 offerings, because the dense works so well. Glad they are also functional!

++0ptim4l@x0ptimal·20h

@CosmicMonad @bnjmn_marie Can agree from experience. It's straight dog water when it comes to tool calling. Been running qwen3.5 35b and it just works perfectly 👌

Benjamin Marie@bnjmn_marie·1d

I published my full analysis comparing Gemma 4 31B vs Qwen3.5 27B.
> Best accuracy: Gemma 4 31B
> Best token efficiency: Gemma 4 31B
> Best raw inference throughput: Qwen3.5 27B
> Best memory footprint: Qwen3.5 27B
> Best end-to-end latency: task-dependent, with Gemma 4 slightly ahead on harder tasks and Qwen3.5 ahead on simpler ones
> Best “fast” mode: Gemma 4 31B with thinking disabled
> Best generalization / least benchmark affinity: Gemma 4 31B
All the results and data here: kaitchup.substack.com/p/gemma-4-31b-…

Max Headroom retweeted

McNasty@McNasty·1d

its crazy how nobody talks about the epstein files anymore

Max Headroom@CosmicMonad·20h

@venusanbacchus Why does this graph only go to 130?

cokelie kirkamine@venusanbacchus·1d

i got an iq test done and then i saw a graph like this a few months later and stopped caring

creekseeker@mudscryer

Just proved that IQ is an infohazard. YOURE WELCOME

Max Headroom@CosmicMonad·20h

@Logically_JC Every 4 years literally half the country stands against the current president, and gets its way roughly 50% of the time. Not that it matters to the epstein class, they run both candidates who answer to them.

John Collins@Logically_JC·1d

Yes, you can.
