Gene Torres

@einobaka

Joined October 2022

157 Following · 15 Followers

Gene Torres@einobaka·1d

@sudoingX I got 2x 3090, still experimenting with models (moving to Gemma 4 31b Q8 soon)

1.6K

Sudo su@sudoingX·1d

buy a gpu. 3090, 4090, dgx spark, whatever fits your budget. tier doesn't matter. running your first local model does. the moment your first prompt lands with no api between you and the model, your brain rewires. that single moment is worth more than every take you'll ever read on a timeline.

630

27.9K

Gene Torres@einobaka·2d

@sudoingX Would you run the same model on 2x 3090 but with a higher-precision quant? Like q8?

126

Sudo su@sudoingX·2d

i declare qwen 3.6 27b dense q4 the king of a single rtx 3090 card. not even close. this model is an absolute beast on local ai, ruthless on agentic loops, owns its own thinking. anyone can use it on a single 3090, the weights are open, the stack is reproducible, the prompt is canonical, every claim below is verifiable on your own hardware. the octopus invaders one shot you are seeing is the visible test. i run these models on workloads you wouldn't think to ask for and i couldn't show you if i wanted to, and qwen 3.6 27b dense q4 quietly does the heavy lifting on a single consumer card while the rest of the field is busy explaining why it cannot. if you think a different model is king on a single 3090 right now, name it. drop your card, drop your model, drop your numbers. the throne is not crowded.

Sudo su@sudoingX

update: qwen 3.6 27b dense q4 just one shotted octopus invaders game on a single 3090. hermes agent drove the whole thing, ~41 tok/s gen 21gb vram at full 262k context, thinking mode on. one prompt in and the canonical multi-file space shooter benchmark out, the same exact prompt i ran on qwen 3.5 27b dense back in march on the same card. 3.5 needed one external scope bug fix before the game would even load on first play. 3.6 needed nothing. 11 of 11 files written, 2411 lines of code, zero steering interventions, zero external fixes, playable on first load. 16 minutes 41 seconds wall clock from prompt to playable. consumer tier king on a single 3090 is locked tonight, and the silicon underneath my desk did not change between march and now. the open source ecosystem just moved the floor. watch it ship itself, the full 16 minutes 41 seconds sped to 3 minutes 45, no human touched the keyboard between the first prompt and the final frame.
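For readers who want to sanity-check the timing figures in the post above, here is a minimal sketch. It uses only the numbers quoted there (16 min 41 s wall clock, a 3 min 45 s video, 2411 lines of code); the helper name `to_seconds` and the derived "lines per minute" metric are illustrative assumptions, not something the author reported.

```python
# Figures quoted in the post; everything derived below is illustrative only.
LINES_OF_CODE = 2411


def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair to total seconds (hypothetical helper)."""
    return minutes * 60 + seconds


wall_clock_s = to_seconds(16, 41)   # prompt to playable, as reported
video_s = to_seconds(3, 45)         # length of the sped-up recording

# How much the recording was accelerated relative to real time.
playback_speed = wall_clock_s / video_s

# Average output rate over the whole run, including thinking and tool calls.
lines_per_minute = LINES_OF_CODE / (wall_clock_s / 60)

print(f"wall clock: {wall_clock_s} s, video: {video_s} s")
print(f"playback speed: {playback_speed:.2f}x")
print(f"average: {lines_per_minute:.1f} lines of code per minute")
```

The lines-per-minute figure is an average over the full run, so it says nothing about how time was split between generation, thinking, and tool calls.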

493

39.2K

Gene Torres@einobaka·Apr 24

@sudoingX I got 2x 3060 12gb I'm wanting to pair on a split lane mobo, just waiting on the parts. I know it won't be as fast as a single card, but I'm looking forward to trying out what you've been posting!

301

Sudo su@sudoingX·Apr 23

dude! the new qwen 3.6-27b dense is hammering my single 3090 at 100% gpu utilization. the spiky pattern on nvtop is the hermes agent autonomously thinking, calling tools, reading results, thinking again. this model is so cool to talk to. waits for tool outputs, reads them, self-corrects, keeps going. no stalls, no loops, no hand holding. anyone running a single 3090 or any 24gb tier card should try this. same llama.cpp flags from last sweep, same hermes agent install. three commands and you are watching your own hardware think.

Sudo su@sudoingX

before i touch any turbo or quant tricks on the new qwen 3.6-27b dense, i ran a full context sweep on a single 3090. same flags as the march qwen 3.5 baseline. same hardware. the architecture is inherited. exact same vram footprint as 3.5 at every context: 16 gigs at 4k, 18 at 128k, 21 at 262k, 23 at the 376k vram wall. and 13.7 percent faster on identical config. 40.13 tok/s vs 35.30. flat curve from 4k to the wall. next is autonomous agent tasks on hermes agent. single file edits, multi file changes, tool calls, ui builds. i post results as they land, you judge. octopus invaders closes the run as the standard benchmark.
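The sweep numbers in the post above can be checked with a few lines of arithmetic. This is a rough sketch using only the quoted figures; it assumes "4k", "128k", "262k" and "376k" mean multiples of 1,000 tokens (they may well be powers of two), and the function names are hypothetical.

```python
# Throughput figures quoted in the post (tokens per second, single RTX 3090).
QWEN_3_6_TOK_S = 40.13
QWEN_3_5_TOK_S = 35.30

# Reported VRAM footprint in GB per context length.
# Assumption: "k" is read as 1,000 tokens; the post does not say.
VRAM_GB_BY_CONTEXT = {4_000: 16, 128_000: 18, 262_000: 21, 376_000: 23}


def percent_faster(new: float, old: float) -> float:
    """Relative speed-up of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


def gb_per_100k_tokens(points: dict[int, int]) -> float:
    """Average VRAM growth per 100k tokens between the smallest and largest context."""
    smallest, largest = min(points), max(points)
    return (points[largest] - points[smallest]) / (largest - smallest) * 100_000


speedup = percent_faster(QWEN_3_6_TOK_S, QWEN_3_5_TOK_S)
growth = gb_per_100k_tokens(VRAM_GB_BY_CONTEXT)

print(f"qwen 3.6 vs 3.5: {speedup:.1f}% faster")
print(f"VRAM growth: about {growth:.2f} GB per 100k tokens of context")
```

The speed-up works out to the 13.7 percent stated in the post. The growth rate is only an end-to-end average; the reported points are not perfectly linear.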

197

22.1K

Gene Torres@einobaka·Apr 8

@sudoingX Would a 7900XTX also work in lieu of a 3090? They're getting absurdly priced in the used market

115

Sudo su@sudoingX·Apr 8

switch to linux. wsl works but you'll spend more time debugging windows quirks than actually running models. at 2am when something breaks you want it to just work. ubuntu on a 3090 with llama.cpp and hermes agent just works. make the switch and never look back.

tuba@tubatrades

@sudoingX @Teknium @NousResearch hey sudo, is there a way to utilize my 3090 safely with hermes agent and a local model if I have the 3090 in my windows system? I saw that you can use WSL but not sure if that's secure enough

172

9.9K

Gene Torres@einobaka·Jan 20

@antmillionsbot 🤔🤔

QME