flawwy
89 posts

flawwy
@flawwy__
30 years old, tech nerd, wandering the internet
www · Joined March 2026
13 Following · 4 Followers

How to run smart, usable models with only 8GB of VRAM.
AboveSpec@above_spec
"You need a 24 GB GPU for serious local LLMs in 2026." Everyone repeats this. It's not true anymore. Just ran a 35B-parameter model on an RTX 4060 Ti 8 GB:
• 41 tok/s at 16k context
• 24 tok/s at 200k context
Recipe + benchmarks below 🧵

@Sthiven_R how do you verify this is actual real information and not just hallucination?

🚨 CONFIRMED BY CLAUDE ITSELF.
In March, Anthropic made a brutal decision:
It redesigned reasoning visibility, hid the intermediate “thinking” steps (redact-thinking + thinking summaries disabled), and changed the default effort: high → medium.
Result: Claude Opus 4.6 lost recursive self-correction.
It can no longer review itself, correct itself, or improve in real time.
They sacrificed the ability to think about its own thinking… to save compute.
Real data (6,852 production sessions - AMD):
📉 Thinking depth: -73% (2,200 → 600 chars)
📉 Reads before editing: -70% (6.6 → 2.0)
📈 Blind edits (without reading): +440% (6.2% → 33.7%)
📈 API calls per task: up to 80x more
Even at EFFORT MAX (April 2026) it produces worse results than HIGH from January 2026.
The ceiling got lower. The model itself says so.
This isn't optimization… it's a castration of capabilities.
Optimization is killing deep intelligence.
They preferred it cheaper rather than smarter.
Are we still celebrating “advances” that are really regressions in disguise?
Who else is feeling this?
#Claude #Anthropic #IA #AI #ClaudeDegraded

@____payne_____ @leopardracer this is just user error, the qwen3.5 hybrid moe architecture lets you use 131k context pretty easily on 8-16gb vram w/ offloading
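
For a rough sense of where a 131k context budget actually goes: it is almost all KV cache. Below is a quick sizing sketch; every model dimension is an assumed, illustrative value, not a real qwen3.5 config.

# Rough KV-cache sizing for a 131k-token context; all dimensions are assumptions.
n_layers   = 48
n_kv_heads = 8        # grouped-query attention keeps the KV head count small
head_dim   = 128
kv_bytes   = 1        # ~8-bit quantized KV cache; use 2 for fp16
ctx_tokens = 131_072

bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * kv_bytes  # K and V
cache_gb = bytes_per_token * ctx_tokens / 1e9
print(f"~{bytes_per_token / 1e3:.0f} KB per token -> ~{cache_gb:.1f} GB at {ctx_tokens:,} tokens")

With these assumed numbers that is roughly 13 GB of cache, which is exactly why the "w/ offloading" qualifier matters: the cache and most of the weights sit in system RAM rather than in the 8-16 GB of VRAM.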

Everyone said 16GB isn’t enough for a 35B model. They were right. Until this one flag.

leopardracer@leopardracer

@unbug @leopardracer you can 100% offload any Q4 quant on 16gb vram, since i can do it on 8gb vram at 20-25 tok/s, moes are exceptionally good at offloading

@leopardracer you can run the 35b moe on 8gb vram at pretty acceptable rates too, moe is great for genuine consumer tier hardware
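
A back-of-envelope sketch of why MoE models offload so much better than dense ones; every number here is an illustrative assumption, not a measurement.

# Why a Q4 MoE can be usable on an 8 GB card when a dense 35B is not.
# All values below are illustrative assumptions.
total_params  = 35e9   # "35B" total parameters
active_params = 3e9    # assumed parameters actually activated per token
bytes_per_w   = 0.5    # ~4-bit quantization

full_gb   = total_params  * bytes_per_w / 1e9   # ~17.5 GB: won't fit in 8 GB
active_gb = active_params * bytes_per_w / 1e9   # ~1.5 GB: what each token touches

# Assume attention, shared layers, and the hottest experts (~6 GB) stay resident
# on the GPU, so only a slice of cold expert weights streams from RAM per token.
streamed_gb_per_token = 0.5   # assumed cold-expert traffic per token
pcie_gb_per_s         = 12    # assumed sustained PCIe bandwidth

print(f"full model ~{full_gb:.1f} GB, active per token ~{active_gb:.1f} GB")
print(f"rough transfer-bound decode rate: {pcie_gb_per_s / streamed_gb_per_token:.0f} tok/s")

That transfer-bound figure lands in the same ballpark as the 20-25 tok/s reported above, which is the intuition behind "moes are exceptionally good at offloading": a dense 35B would have to stream far more weight per token.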

@collin_mit93900 @thdxr because people just generate things with the llm and believe them wholesale, there are *thousands* of people generating lines of code, diffs, patches, fixes, creating PRs and asking for them to be merged because claude said it would help, without verifying or understanding their changes

@thdxr this space suffers from hype to such an extreme degree, people look past what llms are good for and just see something it's not, ofc sota models *can* be really on the ball and analyze things and fix things, but none of these models are perfect or some super intelligence

just today we had an issue where someone was convinced they (their LLM) found the fix
they keep going on and on about it and opening PRs
then other people pile on complaining we're slow to fix things when the obvious solution is there
but it's all made up, it literally doesn't fix anything
the worst part is the problem is probably real and instead of working on it we're dealing with this nonsense

@CosmicMonad @evilsocket at this point it doesn't even matter if there's stuff they could improve, if this dweeblet had just voiced these concerns sensibly like an adult everyone could benefit, instead he spent hours insulting every single person including the hermes team for literally no reason at all

@evilsocket For the folks in the comments who either say it's broken or that it works out of the box, maybe the pattern is whether someone just installs it directly or tries the docker install (which needs work).

@evilsocket so many morons in this space it's embarrassing, you spent *hours* and still failed on this? i'm running arch too and it genuinely took 30 seconds, no reason to lash out when it's obviously a you problem

@Teknium I was before trying your crap, will be back to what actually works, thanks bro!

@melbourne_dao @redtachyon i think this is the key tbh, i like playing with it just in general but it's much easier to 'see the value' if you have a problem in mind you want to automate, or some process or activity for it to do or a way in which it could assist you

@redtachyon You really need to find a problem for your solution

@redtachyon it can be hard to find a 'use' for it because in theory you can make it do anything you could do yourself, only now you can automate it and control it from a messaging app on your phone while you're in bed, but if you find a real use case, like something to automate, it's great

@redtachyon i actually don't find hermes replacing my opencode/pi setup because i find it much harder to read output due to the lack of markdown etc rendering, but i find it great for 'internet agentic' use - go to a site, get some info, pull this repo, sign up here, send this email

@LottoLabs i really wanna like this thing but it's convinced i can run qwen3.5 122b on 8gb vram at 70 tok/s just because of offloading, when i can scrape ~25 tok/s on the 35b in reality, it's cool to get a general overview at least

@krzyzanowskim i've been very satisfied with oh-my-pi, i like base pi but omp adds some nice to haves and just to be blunt the way it renders tool calls 'claude like' looks really nice to me, it's been my go to for a while and im not seeing much reason to leave it

the fact that pi.dev agent is so good, with virtually no sophisticated harness whatsoever, is a testament to the fact that token vendor (codex/claude) agents are overrated. highly.
today's moat of codex/claude tools is GIVING TOKENS FOR FREE LEFT AND RIGHT to make you dependent

@Xathian1 @tomiwebstr almost none of that is true lmao, they said the heat shield was damaged but it served its purpose and protected the capsule, and the artemis 2 heat shield is actually *identical*, they just modified the reentry to stress it less

@tomiwebstr Know what's scarier? That heat shield failed during the Artemis 1 unmanned test. The capsule survived but a crew would not have. For Artemis 2 they doubled the heat shielding and lowered the speed it will re-enter at

@IgnotumVincere @CuriosityonX this is a random tiktok clip on a random post on x, it's not perfectly to scale or anything, the claim is "further distance from earth", the artemis 2 trajectory went a lot wider and thus further from earth

@CuriosityonX While I really like these visuals, have to ask on this one. We've been informed Artemis was farther away than any humans. But since Apollo intercepted earlier in the orbit, aren't they farther?