d4mations

14.9K posts

d4mations

@d4mations

España · Joined September 2009
427 Following · 241 Followers
d4mations
d4mations@d4mations·
@ivanfioravanti @Apple Not so sure 4-bit is really useful day to day. For benchmaxxing it's fine, but for real-world use I'd prefer 8-bit, or 6-bit at worst.
English
1
0
3
143
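For context, comparing quant levels with mlx_lm is a one-line change per model. A minimal sketch, assuming the mlx-community repo names below exist for the quants you want to compare:

```python
# Hypothetical comparison of 4-bit vs 8-bit MLX quants of the same model.
# Repo names are assumptions; substitute whichever quants you actually use.
from mlx_lm import load, generate

prompt = "Explain the painter's algorithm in two sentences."

for repo in (
    "mlx-community/Qwen3.5-27B-4bit",   # assumed repo name
    "mlx-community/Qwen3.5-27B-8bit",   # assumed repo name
):
    model, tokenizer = load(repo)
    out = generate(model, tokenizer, prompt=prompt, max_tokens=128)
    print(f"--- {repo} ---\n{out}\n")
```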
Ivan Fioravanti ᯅ
Ivan Fioravanti ᯅ@ivanfioravanti·
1/3 MLX Context Benchmark of Qwen3.5-27B-4bit on M5 Max 128GB. Strong model and good speed overall! @Apple M5 Ultra will be a beast!
Ivan Fioravanti ᯅ tweet media
English
11
5
65
3.7K
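A context benchmark in this spirit can be approximated with a few lines of mlx_lm: grow the prompt and time generation at each size. A rough sketch; the repo name and the tokens-per-repeat estimate are assumptions:

```python
# Crude context-length benchmark sketch: time generation as the prompt grows.
import time
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen3.5-27B-4bit")  # assumed repo name

filler = "The quick brown fox jumps over the lazy dog. "    # ~10 tokens per repeat
for n in (1, 64, 256, 1024):
    prompt = filler * n + "Summarize the text above in one sentence."
    start = time.perf_counter()
    out = generate(model, tokenizer, prompt=prompt, max_tokens=64)
    dt = time.perf_counter() - start
    # Rough tok/s: generated tokens over wall time (includes prompt processing).
    toks = len(tokenizer.encode(out))
    print(f"~{n * 10:>6} ctx tokens: {toks / dt:5.1f} tok/s (wall)")
```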
d4mations
d4mations@d4mations·
@JamesNumb3rs @_weiping Hhhmmmm, I'm on mlx_lm on a Mac with an MLX 8-bit quant and opencode desktop. Wonder if there's something in that stack that's not helping.
English
0
0
0
25
James
James@JamesNumb3rs·
@d4mations @_weiping Worked fine for me: opencode + llama.cpp on an AMD 9070 16GB with spillover to DDR5. Went to plan mode, asked it to build an Amiga demo: rotating cube, painter's algorithm. Then build mode. It did it. ~50 tok/s write, ~400 tok/s prompt processing.
English
1
0
0
35
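The painter's algorithm that demo used is just depth sorting: draw the farthest faces first so nearer ones paint over them. A minimal sketch of that draw-order computation for a rotated cube (not the agent's actual output):

```python
# Painter's algorithm sketch: rotate a cube, sort faces back-to-front by mean
# depth, and "draw" in that order so near faces overwrite far ones.
import math

verts = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
faces = {  # indices into verts, one quad per cube face
    "left": (0, 1, 3, 2), "right": (4, 5, 7, 6),
    "bottom": (0, 1, 5, 4), "top": (2, 3, 7, 6),
    "back": (0, 2, 6, 4), "front": (1, 3, 7, 5),
}

def rotate_y(p, a):
    x, y, z = p
    return (x * math.cos(a) + z * math.sin(a), y, -x * math.sin(a) + z * math.cos(a))

rotated = [rotate_y(v, math.radians(30)) for v in verts]

def depth(face):
    return sum(rotated[i][2] for i in face) / len(face)

# Camera looks along -z from +z, so smaller z means farther: draw those first.
for name, face in sorted(faces.items(), key=lambda kv: depth(kv[1])):
    print(f"draw {name:6s} (mean z = {depth(face):+.2f})")
```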
Wei Ping
Wei Ping@_weiping·
🚀 Introducing Nemotron-Cascade 2 🚀

Just 3 months after Nemotron-Cascade 1, we're releasing Nemotron-Cascade 2: an open 30B MoE with 3B active parameters, delivering best-in-class reasoning and strong agentic capabilities.

🥇 Gold Medal-level performance on IMO 2025, IOI 2025, and ICPC World Finals 2025:
• Capabilities once thought achievable only by frontier proprietary models (e.g. Gemini Deep Think) or frontier-scale open models (i.e. DeepSeek-V3.2-Speciale-671B-A37B).
• Remarkably high intelligence density with 20× fewer parameters.

🏆 Best-in-class across math, code reasoning, alignment, and instruction following:
• Outperforms the latest Qwen3.5-35B-A3B (2026-02-24) and even the larger Qwen3.5-122B-A10B (2026-03-11).

🧠 Powered by Cascade RL + multi-domain on-policy distillation:
• Significantly expands Cascade RL across a much broader range of reasoning and agentic domains than Nemotron-Cascade 1, while distilling from the strongest intermediate teacher models throughout training to recover regressions and sustain gains.

🤗 Model + SFT + RL data: 👉 huggingface.co/collections/nv…
📄 Technical report: 👉 research.nvidia.com/labs/nemotron/…
Wei Ping tweet media
English
39
132
838
116.8K
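The "30B MoE with 3B active parameters" framing is about sparse routing: a router picks a few experts per token, so only a fraction of the weights participate in each forward pass. A toy sketch with illustrative numbers, not Nemotron-Cascade's actual architecture:

```python
# Toy MoE layer: top-k routing means only a slice of expert weights runs per token.
import numpy as np

d, n_experts, top_k = 64, 16, 2          # hidden size, expert count, experts per token
experts = [np.random.randn(d, d) * 0.02 for _ in range(n_experts)]
router = np.random.randn(d, n_experts) * 0.02

def moe_layer(x):                         # x: (d,) single token
    logits = x @ router
    picked = np.argsort(logits)[-top_k:]  # indices of the top-k experts
    weights = np.exp(logits[picked])
    weights /= weights.sum()              # softmax over the chosen experts
    return sum(w * (x @ experts[i]) for w, i in zip(weights, picked))

x = np.random.randn(d)
print(moe_layer(x).shape)                 # (64,) — only 2 of 16 expert matrices touched
print(f"active fraction ≈ {top_k / n_experts:.0%} of expert params per token")
```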
d4mations
d4mations@d4mations·
@daniel_mac8 Really good performance, but tool calling is broken in opencode, which makes Cascade 2 completely unusable.
English
0
0
1
115
d4mations
d4mations@d4mations·
@iMilnb I get 50+ tok/s on an M1 64GB at 8-bit. It's just not that great a model: it gets stuck in a tool-calling loop that it can't get out of.
English
0
0
1
231
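One common mitigation for that failure mode is a loop guard in the agent harness: if the model issues the identical tool call several times in a row, abort instead of burning tokens. A hypothetical sketch; `run_tool` is a stub for whatever dispatcher your agent uses:

```python
# Hypothetical tool-loop guard: abort if the same call repeats MAX_REPEATS times.
from collections import deque

MAX_REPEATS = 3
recent = deque(maxlen=MAX_REPEATS)

def run_tool(call):
    return f"ran {call['name']}"  # stub: replace with the agent's real dispatcher

def execute(call):
    # Normalize the call into a hashable key: (tool name, sorted args).
    key = (call["name"], tuple(sorted(call["args"].items())))
    recent.append(key)
    if len(recent) == MAX_REPEATS and len(set(recent)) == 1:
        raise RuntimeError(f"tool loop: {call['name']} repeated {MAX_REPEATS}x")
    return run_tool(call)

print(execute({"name": "read_file", "args": {"path": "main.py"}}))
```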
d4mations
d4mations@d4mations·
@gregogallagher This isn't even a perception debate; it's a matter of the composition of the two peptides. Tirz will produce more suppression at the same dose as reta, since it contains more GLP-1 than reta does.
English
0
0
0
145
Greg O'Gallagher
Greg O'Gallagher@gregogallagher·
Some claim Reta suppresses appetite better. Others claim the exact opposite, Tirzep suppresses hunger better. Which is it?
English
45
2
70
15.3K
Michael Mindrum, MD
Michael Mindrum, MD@MichaelMindrum·
I am moving from aol to yahoo for email. Seems like a much better model.
English
2
0
3
566
Jonathan Rudderham
Jonathan Rudderham@codeRunnerUK·
@TommyFalkowski I found it hit-and-miss on my long-context coherence test. Now that I've started testing non-Qwen3.5 models, it's showing how good Qwen3.5 is, but at 35B-A3B it's definitely a "your mileage may vary" model for long context.
English
1
0
1
68
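A long-context coherence test in this spirit can be as simple as a needle-in-a-haystack check against any OpenAI-compatible local server. A sketch; the endpoint, model name, and padding size are assumptions:

```python
# Needle-in-a-haystack sketch: bury a fact mid-context and ask for it back.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="local")  # assumed endpoint

needle = "The vault code is 4-9-2-7."
haystack = "Lorem ipsum dolor sit amet. " * 2000        # pad to a long context
mid = len(haystack) // 2
prompt = haystack[:mid] + needle + haystack[mid:]

resp = client.chat.completions.create(
    model="qwen3.5-35b-a3b",                             # assumed model id
    messages=[{"role": "user", "content": prompt + "\n\nWhat is the vault code?"}],
)
answer = resp.choices[0].message.content
print("PASS" if "4-9-2-7" in answer else "FAIL", "-", answer)
```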
Tommy Falkowski
Tommy Falkowski@TommyFalkowski·
Qwen3.5 35B-A3B can actually do stuff. It's so weird and awesome to see a local model do useful things. Even with a pretty full context window it still seems usable.
English
7
3
66
6.4K
💯Mike Appy💯
💯Mike Appy💯@DigitalFringed1·
After about 6 weeks back on a 100mg TRT maintenance dose following my first tiny cycle, I feel perfect, exactly like my old self. Fewer variables & less pinning is nice as well. I'm staying here long-term. The only addition I'll make is MT2, that stuff just works.
English
16
1
94
8K
Saint Gabriel ✝️
Saint Gabriel ✝️@Gabri3lTheGreat·
has anyone actually stacked tirz & reta before
English
26
0
15
16.1K
d4mations
d4mations@d4mations·
@peer_rich I’m looking for a peptide that’ll take that urge away!!!! I’m desperate!!
English
0
0
0
53
Peer Richelsen
Peer Richelsen@peer_rich·
the male urge to build a GPU cluster at home
English
82
238
2K
77.4K
d4mations
d4mations@d4mations·
@Clark10x Haha, this is the first polymarket post I believe!!!
English
0
0
0
284
Clark 🆓
Clark 🆓@Clark10x·
I've spent about 3 weeks and over $2,000 in Claude tokens setting up multiple agents to trade on Polymarket It was paper trading for a week and averaged about $80 per day It's been live for 2 days & completely broken All of these other people gotta lying on the timeline 😂😂
Clark 🆓 tweet media
English
493
96
4.5K
396.1K
Paweł Z
Paweł Z@Pawzgm·
@wildmindai GLM is trash. Get a second-hand 3090 24GB for $1,000 and run Qwen3.5 27B, which is almost the same as GPT 5.2 and a bit worse than Opus.
English
2
1
20
2.8K
Wildminder
Wildminder@wildmindai·
Hot! You don't need a $5k GPU to run a 30B model.

A dev drops a custom Linux module and claims the GLM-4.7-Flash model runs smoothly on an RTX 5070 with 12GB VRAM:
- normal PC RAM pretends to be VRAM
- bypasses the CPU completely
- inference speeds actually stay fast

But... wait, don't we already have offloading? gitlab.com/IsolatedOctopi…
Wildminder tweet media
English
13
20
328
32.1K
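The offloading we "already have" is per-layer: llama.cpp lets you pin some transformer layers to the GPU and run the rest on CPU, rather than pretending system RAM is VRAM. A sketch with llama-cpp-python; the GGUF path and layer count are assumptions for a 12GB card:

```python
# Standard llama.cpp layer offloading: n_gpu_layers layers on the GPU, rest on CPU.
from llama_cpp import Llama

llm = Llama(
    model_path="./glm-4.7-flash-q4_k_m.gguf",  # assumed local GGUF path
    n_gpu_layers=20,                            # assumed split for a 12GB card
    n_ctx=8192,
)
out = llm("Explain VRAM offloading in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```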
Sergizz
Sergizz@Sergizzzz4·
@LottoLabs What model of qwen 3.5 27b can I run on my Macbook M1 Max 64GB properly? Thank you ser
English
4
0
8
1.7K
Lotto
Lotto@LottoLabs·
Essentials/ 3090 > qwen 3.5 27b > hermes agent > tailscale
English
22
13
395
40.2K
d4mations
d4mations@d4mations·
@sudoingX Did the migration last night but encountered a bug with custom/local LLMs where it would not use the correct API key.
English
1
0
1
808
Sudo su
Sudo su@sudoingX·
i get this question a lot recently so let me be clear.

hermes has 11 model-specific tool call parsers built in. it knows how qwen, deepseek, llama, and mistral format their tool calls natively. openclaw doesn't parse any of that; it expects the API to handle it. that's why my 9B on a 3060 never drops a tool call through hermes. every single tool call lands clean because the parser understands the model's format, not just the API spec.

the openclaw founder joined openai in february and the project got handed to a foundation. hermes is built and maintained by nousresearch, an independent research lab that ships open source models and tools for the community. you decide who you want holding your agent framework.

if you're on openclaw right now, hermes already has a migration script built in. i'll help anyone who wants to make the switch.
AGI Mind@agi_mind24

@sudoingX Why Hermes and not just Openclaw? What are the benefits?

English
51
43
631
98.8K
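What a model-specific tool-call parser buys you: Qwen-style models, for example, wrap tool calls in `<tool_call>` tags around JSON, and a parser that knows the format can extract them even when the API layer doesn't. A sketch of that idea under the assumed Qwen convention, not Hermes internals:

```python
# Model-specific tool-call parsing sketch: extract <tool_call>{...}</tool_call>
# blocks from raw model output instead of relying on the API to do it.
import json
import re

QWEN_TOOL_CALL = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def parse_tool_calls(text: str) -> list[dict]:
    calls = []
    for m in QWEN_TOOL_CALL.finditer(text):
        try:
            calls.append(json.loads(m.group(1)))
        except json.JSONDecodeError:
            continue  # malformed block: skip it rather than drop the whole turn
    return calls

raw = '<tool_call>{"name": "read_file", "arguments": {"path": "main.py"}}</tool_call>'
print(parse_tool_calls(raw))  # [{'name': 'read_file', 'arguments': {'path': 'main.py'}}]
```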
d4mations
d4mations@d4mations·
@Teknium I think I found a bug in how hermes handles local/custom providers: it doesn't seem to pick up or send the right API key. Will investigate a bit more and send a PR.
English
0
0
1
27
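A minimal way to check whether the right key is being sent is to bypass the framework and hit the provider directly with an explicitly resolved key. A hypothetical repro sketch; the endpoint, env var name, and model id are all assumptions:

```python
# Direct request to an OpenAI-compatible local provider with an explicit key,
# to rule the server in or out before blaming the agent framework.
import os
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",                        # assumed local endpoint
    api_key=os.environ.get("LOCAL_LLM_API_KEY", "not-needed"),  # hypothetical env var
)
resp = client.chat.completions.create(
    model="local-model",                                        # assumed model id
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```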
Samuel Fajreldines
Samuel Fajreldines@devindolar·
@nix_eth Have you tried Qwen3.5-27B dense? I've bought the same Mac. So excited to receive it! What system are you using? oMLX?
English
4
0
4
1.3K
nix.eth
nix.eth@nix_eth·
LLM speed on my MacBook M5 Max (128GB):
• Qwen3.5-35B-A3B (Q6): 74 tok/s
• Nemotron-3 Super (Q4): 24 tok/s
• Qwen3-Coder-Next (6-bit): 67 tok/s
• Llama 3.3 8B Instruct (Q4): 99 tok/s

On my M1, Llama 3.3 was my go-to for most local tasks at about 20 tok/s. On the M5 Max, it's hitting 99 tok/s.

Qwen3.5 feels like a huge upgrade and is my favorite so far. Qwen3-Coder-Next is surprisingly good at dev tasks, although I'll probably stick with GPT-5.4 for most. I'm also impressed by Nemotron-3 Super, but its personality feels a bit too dry.
nix.eth tweet media (4 images)
English
85
74
958
75.8K