Bub

1.4K posts

@BubCasto

USA · Joined September 2020
1K Following · 489 Followers
Bub @BubCasto:
La Liga is so corrupt, the red for Fede is insane
0 replies · 0 reposts · 0 likes · 23 views

Bub @BubCasto:
@koenvaneijk @LottoLabs So the trade-off is context. I’ve got a 5090 too, so I’m just trying to wrap my head around all this. I’ll give Q6 (and maybe Q8) a shot!
Bub tweet media
1 reply · 0 reposts · 0 likes · 24 views

Lotto @LottoLabs:
Okay, tonight we do vLLM vs LM Studio (llama.cpp) checks w/ Qwen 27B
20 replies · 0 reposts · 123 likes · 8.9K views

Koen @koenvaneijk:
@LottoLabs I use Qwen 3.5 27B Q6 on my RTX 5090 with 90k context in llama.cpp-server at 60 tps. It feels like AGI at home.
3 replies · 0 reposts · 2 likes · 112 views

FeliZidane @Felizidaane:
Bellingham’s first call-up after several months out injured. A starter on name recognition, not on sporting merit or competitive rhythm (0 minutes played beforehand). Result: 5-2 and an overrun midfield. Please, @aarbeloa17, don’t fall into the same mistake
FeliZidane tweet media
11 replies · 32 reposts · 312 likes · 19.2K views

Bub @BubCasto:
cool it with the glazing qwen, nbd
Bub tweet media
0 replies · 0 reposts · 0 likes · 9 views

Bub @BubCasto:
Is it just me, or is there massive larping across the board? I saw a video about autoresearch talking about all the things you can do with it to make your business better, but Qwen 3.5 told me that’s not what autoresearch is for. Anyway, I’m starting from step 1 just to get my feet wet and see what this is really about. I ran the default experiment and got these results. Cool, I guess? lol. Gonna tweak the settings and run more. Like and follow to join me on this journey to who knows where
Bub tweet media
0 replies · 0 reposts · 0 likes · 13 views

Bub @BubCasto:
@Rup4kC The decisive passing at the 1:50 mark and then the sly ball to Valverde in traffic is really beautiful play
0 replies · 0 reposts · 4 likes · 2.2K views

Rupak @Rup4kC:
Literally what Vitinha does. We don’t need a new midfielder in the summer if Mami can show he’s capable over a full 90. I wanna see him start some games over Pitarch (not convinced)
57 replies · 185 reposts · 3.7K likes · 227.6K views

Nous Research @NousResearch:
Hermes Agent v0.3.0 ☤ 248 PRs. 15 contributors. 5 days.
• Real-time streaming across CLI and all platforms
• First-class plugin architecture; package and share tools + commands + skills
• /browser: connect to live Chrome via CDP
• @vercel AI Gateway model provider
• @browser_use browser tool provider
• VS Code, Zed, and JetBrains integration
• Voice mode with local Whisper
• PII redaction everywhere
9 new skills. 50+ bug fixes. Much more in the full changelog.
Nous Research tweet media
75 replies · 79 reposts · 1.1K likes · 411.7K views

Bub @BubCasto:
These guys are cooking, boiling lobster if you will
Quoting Nous Research @NousResearch: “Hermes Agent v0.3.0 ☤ 248 PRs. 15 contributors. 5 days. …” (quoted in full above)
0 replies · 1 repost · 3 likes · 68 views

Lotto @LottoLabs:
If you’re so smart, why aren’t you running Hermes Agent and Qwen 3.5 27B?
68 replies · 15 reposts · 341 likes · 22K views

Bub @BubCasto:
@Zeneca What model?
0 replies · 0 reposts · 0 likes · 147 views

Zeneca🔮 @Zeneca:
kill me (this is hermes)
Zeneca🔮 tweet media
11 replies · 0 reposts · 38 likes · 4.4K views

Zeneca🔮 @Zeneca:
I'm convinced that if you want to maximize productivity, you shouldn't be using openclaw or hermes. They take so much time bug fixing that you're better off just using claude code/codex directly. There's maybe 1% of people who are the exception to this
175 replies · 10 reposts · 451 likes · 66.5K views

Bub @BubCasto:
@sudoingX Bumped it to 192K, f16, stable and still getting 65 tok/s. Using about 28/32GB of available VRAM
0 replies · 0 reposts · 0 likes · 87 views

Bub @BubCasto:
@sudoingX I ran that at 32K context, f16 KV cache. Bumped it to 64K context, f16: 66.3 tok/s. Bumped it to 96K context, f16: 65.4 tok/s. The model says I’ll be pushing it to the edge at 128K f16, so I’m gonna start w/ Q8 and see how it goes
1 reply · 0 reposts · 0 likes · 42 views
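The f16-vs-q8 KV cache trade-off in this exchange can be sketched with back-of-envelope arithmetic. This is a hedged illustration only: the layer count, KV-head count, and head dimension below are assumed placeholders (the thread never states the real Qwen 3.5 27B config), and the q8_0 bytes-per-element figure approximates llama.cpp's 8-bit cache format (selectable via its `--cache-type-k`/`--cache-type-v` options).

```python
# Rough KV-cache sizing for llama.cpp-style serving: per token, each layer
# stores a K and a V vector of n_kv_heads * head_dim elements in the chosen
# cache dtype. All model dimensions here are ASSUMED for illustration.
def kv_cache_bytes(ctx_tokens, n_layers=48, n_kv_heads=8, head_dim=128,
                   bytes_per_elem=2.0):
    """bytes_per_elem: 2.0 for f16, ~1.0625 for q8_0 (8 bits + scale overhead)."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem  # K + V
    return int(ctx_tokens * per_token)

GIB = 1024 ** 3
for ctx in (32_768, 65_536, 98_304, 131_072):
    f16 = kv_cache_bytes(ctx) / GIB
    q8 = kv_cache_bytes(ctx, bytes_per_elem=1.0625) / GIB
    print(f"{ctx // 1024:>4}K ctx: f16 {f16:5.2f} GiB | q8_0 {q8:5.2f} GiB")
```

The cache grows linearly with context, and q8_0 roughly halves it, which is why dropping from f16 to Q8 is the usual move when a longer context stops fitting.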
Sudo su @sudoingX:
drop your GPU below. i'll tell you exactly what model and config to run on it. here's what i've tested and verified on real hardware:
RTX 3060 12GB - Qwen 3.5 9B Q4 - 50 tok/s - 128K context
RTX 3090 24GB - Qwen 3.5 27B Q4 - 35 tok/s - 300K context
RTX 3090 24GB - Qwen 3.5 35B MoE Q4 - 112 tok/s - 262K context
2x RTX 3090 - Qwen3-Coder 80B Q4 - 46 tok/s - full VRAM
all running llama.cpp with flash attention. every number is real. every config is tested. if your card isn't on this list drop it below and i'll tell you what fits.
727 replies · 103 reposts · 1.6K likes · 190.1K views
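The "what fits on your card" question behind the list above can be sanity-checked without downloading anything: multiply parameter count by bits-per-weight. A minimal sketch, assuming approximate average bits-per-weight for common llama.cpp quant types (these averages are my assumption, not from the thread, and they exclude the KV cache and runtime overhead):

```python
# Back-of-envelope check on whether a quantized model's weights fit a card.
# Bits-per-weight values are APPROXIMATE averages for llama.cpp quant types.
BPW = {"Q4_K_M": 4.8, "Q6_K": 6.6, "Q8_0": 8.5, "F16": 16.0}

def weights_gib(params_billions: float, quant: str) -> float:
    """Approximate in-VRAM size of the weights alone (no KV cache, no overhead)."""
    return params_billions * 1e9 * BPW[quant] / 8 / 1024**3

# A 27B model at ~4.8 bits/weight needs roughly 15 GiB for weights,
# which is consistent with running it on a 24 GB card with KV headroom.
print(f"{weights_gib(27, 'Q4_K_M'):.1f} GiB")
```

The gap between this figure and total VRAM is what's left for the KV cache, which is why the claimed context lengths shrink as the model grows.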
𝖒 (hype/acc) @ravespecialist:
cancel your chatgpt subscription and delete your openclaw slop. i'm serious. go on ebay and buy a used RTX 3060 for the price of two months of pro. or check your drawer because half of you already own one and forgot about it.

install hermes agent from @NousResearch. one framework, 31 tools, file operations, terminal, browser, code execution. connect it to your local llama.cpp server running qwen 3.5 9B Q4. total download is 5.3 gigs. that's it. that's the whole setup.

every experiment you hesitated to run on API. every project you shelved because you didn't want your data on someone else's server. every late night idea you didn't test because you hit your rate limit. all of that is gone. runs 24/7 on your electricity. your machine. your data never leaves your house. connect it to telegram if you want it on your phone. hook up whatever tools you need. the model thinks at 29 tok/s with 128K context and it never bills you.

qwen 3.5 9B and one RTX 3060 is the setup most people will never try because they've been trained to believe intelligence has to come from a datacenter. it doesn't. it runs on 12 gigs of VRAM under your desk right now. stop giving your thinking away for free.
13 replies · 1 repost · 57 likes · 5.4K views
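The "connect it to your local llama.cpp server" step rests on llama.cpp's `llama-server` exposing an OpenAI-compatible `/v1/chat/completions` endpoint, which is what agent frameworks point at. A minimal stdlib-only client sketch: the base URL assumes llama-server's default port 8080, and Hermes Agent's actual configuration is not shown in the thread, so this only illustrates the endpoint contract.

```python
import json
import urllib.request

# Minimal client for a local llama.cpp server. llama-server exposes an
# OpenAI-compatible /v1/chat/completions endpoint; the URL assumes its
# default port -- adjust to however you launched the server.
BASE_URL = "http://127.0.0.1:8080"

def build_payload(prompt: str, max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-style chat completion request body."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

def chat(prompt: str) -> str:
    """POST the prompt to the local server and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the endpoint speaks the OpenAI wire format, any client or agent that can target a custom base URL can be pointed at the local server instead of a hosted API.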
Sudo su @sudoingX:
@ravespecialist @NousResearch this is my post, copied word for word. x.com/i/status/20330…
Quoting Sudo su @sudoingX: “cancel your chatgpt subscription and delete your openclaw slop. …” (identical to the @ravespecialist post above)
8 replies · 0 reposts · 153 likes · 3.9K views

Bub @BubCasto:
@meadowdad Carvajal slapped in the face, “actually could be a foul the other way” lol
0 replies · 0 reposts · 0 likes · 18 views

AJ Wray @meadowdad:
@BubCasto Agree. He is horrible. Any time RM gets fouled: “Not a penalty for me”.
1 reply · 0 reposts · 1 like · 28 views

Bub @BubCasto:
#realmadridelche can we get a neutral commentator on the ESPN+ broadcast? Listening to Stuart Robson is like listening to the away feed.
1 reply · 0 reposts · 1 like · 101 views