Sudo su
6.5K posts

[profile banner]

Sudo su @sudoingX

GPU/local LLM and more RAM.

Bangkok, Thailand · Joined August 2022
781 Following · 13.5K Followers

Pinned Tweet
Sudo su @sudoingX

let me get you started in local AI and bring you to the edge. if you have a GPU or are thinking about diving into the local LLM rabbit hole, the first thing you do before any setup is join x/LocalLLaMA. this is the community that will help you at every step. post your issue and we will direct you, debug with you, and save you hours of work.

once you're in, follow these three:

@TheAhmadOsman: the oracle. this is where you consume the latest edges in infrastructure and AI. if something dropped, you hear it from him first. his content alone will keep you ahead of most.

@0xsero: a one-man army when it comes to model compression, novel quantization research, and new tools and tricks that make your local setup better. you will learn, experiment, and discover things you didn't know existed.

@Teknium: maker of Hermes Agent, the agent i use every day, from @NousResearch. from Teknium you don't just stay at the frontier, you get your hands on the tools before everyone else. this is where things are headed.

if you follow me, follow these three and join the community. you will be ahead of most people in this space. if you run into wrong configs, get stuck debugging hardware, or can't get a model to load, post there so we can help.

get started with local AI now. don't just understand the stack, own your cognition. don't pay openai fees on top of giving them your prompts, your research, and your most valuable thinking to be monitored and metered. buy a GPU and build your own token factory.

[tweet media]

40 replies · 50 reposts · 657 likes · 32.2K views
Sudo su @sudoingX

they are keeping me so compute constrained. I would have double GPUs, double RAM, but hardware prices said no. for now.

0 replies · 0 reposts · 6 likes · 117 views
Sudo su @sudoingX

what have I done. holy shit, this is magic. literal magic.

3 replies · 0 reposts · 14 likes · 933 views
Sudo su @sudoingX

but i understand why some things have to stay in the lab.

1 reply · 0 reposts · 0 likes · 304 views
Sudo su @sudoingX

reward the right behavior long enough and anything learns. RL is just the universe's update rule. good morning btw, let's fucking go 🔥

0 replies · 0 reposts · 1 like · 267 views
Sudo su @sudoingX

every morning i wake up and ask myself what can't be solved with reinforcement learning. still waiting for an answer.

2 replies · 0 reposts · 12 likes · 608 views
Sudo su @sudoingX

Hermes agent from bed

2 replies · 1 repost · 36 likes · 1.3K views
Sudo su @sudoingX

@outsourc_e did you wrap the majestic agent into a workspace? interesting.

1 reply · 0 reposts · 1 like · 64 views
Sudo su @sudoingX

@mfranz_on anon, upgrade yourself from that bloated TypeScript mess to the majestic Hermes agent. you deserve better tools. anyone who hasn't.

0 replies · 1 repost · 19 likes · 391 views
Tommy @yeahfortommy

Are we a merch company or a publishing company?

[tweet media]

27 replies · 7 reposts · 107 likes · 6.2K views
Sudo su retweeted
Nous Research @NousResearch

Hermes Agent wrote a novel. "The Second Son of the House of Bells" runs 79,456 words across 19 chapters. The agent built its own pipeline to do it, using the same modify-evaluate-keep/discard loop as @karpathy's Autoresearch but applied to fiction: world-building, chapter drafting, adversarial editing, Opus review loops, LaTeX typesetting, cover art, audiobook generation, and landing page setup.

Book: nousresearch.com/bells
Code: github.com/NousResearch/a…

[tweet media]

emozilla @theemozilla

it's been a longstanding dream of mine to build an AI system that can tell a compelling story. it's what got me started in the space in the beginning, and with Hermes Agent I finally pulled it off. 100% written, typeset, etc. by Hermes Agent. those at our GTC event got hard copies 🤗

39 replies · 51 reposts · 732 likes · 61.1K views
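The modify-evaluate-keep/discard loop described above generalizes beyond fiction. A minimal sketch in Python, where `propose_edit` and `score` are hypothetical stand-ins for the agent's drafting and adversarial-review steps (the real pipeline is in the linked NousResearch repo, not this code):

```python
import random

def keep_discard_loop(draft, propose_edit, score, iterations=100):
    """Generic modify-evaluate-keep/discard loop:
    propose a candidate edit, keep it only if the score improves,
    otherwise discard it and try again from the current best."""
    best_score = score(draft)
    for _ in range(iterations):
        candidate = propose_edit(draft)      # modify
        candidate_score = score(candidate)   # evaluate
        if candidate_score > best_score:     # keep
            draft, best_score = candidate, candidate_score
        # else: discard
    return draft

# toy demo: "edit" a list of numbers toward a higher sum
random.seed(0)
result = keep_discard_loop(
    [1, 2, 3],
    propose_edit=lambda d: [x + random.choice([-1, 1]) for x in d],
    score=sum,
)
```

Because a candidate is only kept when the score strictly improves, the loop can never make the draft worse by its own metric; the interesting engineering is in the evaluator, not the loop.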
Sudo su retweeted
Sudo su @sudoingX

hear this anon: you don't need a $4,699 box to get started with local AI. use what you already have first. test your workload.

this is what a $250 GPU did today. iteration 3 of octopus invaders is here. 4 phases. 6 prompts. zero handwritten code. the same 9B on the same 3060 fixed its own enemy spawning, patched a dual start conflict, added level progression, resized every bullet, and when the browser cached old files it figured that out on its own and added version parameters to force a reload. 3,200+ lines across 13 files. every line by qwen 3.5 9B Q4 at 35-50 tok/s on 12 gigs through hermes agent.

understand what your load actually needs before you build. don't get trapped by influencers selling you boxes next to a plant. test on what you have. then decide. this 3060 impressed me in ways i did not expect and its autonomy is what kept me going. now it's time to move to new experiments on other nodes and other models for all of us.

if you are running this setup, the exact stack, flags, open source code, and exact prompts i used are in the replies. if you run into issues, let me know. seeing students and builders discover hermes from my posts and start running local is why i do this. full autonomous build at 8x speed in the video. gameplay at the end. watch it.

Sudo su @sudoingX

this is what 12 gigs of VRAM built in 2026. a 9 billion parameter model running on a 5 year old RTX 3060 wrote a full space shooter from a single prompt. blank screen on first try. i came back with a bug list and the same model on the same card fixed every issue across 11 files without touching a single line myself. enemies still looked wrong so i pushed another iteration and now the game has pixel art octopi, particle effects, screen shake, projectile physics and a combo system. all running locally on a card that was designed to play fortnite.

three iterations. zero cloud. zero API calls. every token generated on hardware sitting under my desk. the model reads its own code, finds what's broken, patches it, validates syntax and restarts the server. i just describe what's wrong and it handles the rest.

people are paying monthly subscriptions to type into a browser tab and wait for a server farm to respond. meanwhile a GPU you can find used on ebay is running a full autonomous hermes agent framework with 31 tools, a 128K context window and thinking mode, generating at 29 tokens per second nonstop.

the game still needs work. level upgrades don't trigger and boss fights need tuning. but the fact that i'm iterating on gameplay balance instead of debugging whether the code runs at all tells you where this is headed. every iteration the game gets better on the same hardware. same 12 gigs. same 9 billion parameters. same RTX 3060 from 5 years ago. your GPU is not a gaming card anymore. it's a local AI lab that never sends your data anywhere.

25 replies · 26 reposts · 391 likes · 37.6K views
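"Understand what your load actually needs" can start with napkin math: weights at Q4 precision are a known fraction of the full-precision size. A rough sketch, assuming ~4.5 bits per weight as an approximation for Q4-family quants (actual figures vary by quant type, and KV cache plus activations come on top):

```python
def quantized_weight_gb(n_params: float, bits_per_weight: float = 4.5) -> float:
    """Rough VRAM needed for model weights alone, in gigabytes.
    bits_per_weight ~4.5 is an approximation for Q4-family quants."""
    return n_params * bits_per_weight / 8 / 1e9

# a 9B-parameter model at ~4.5 bits/weight needs roughly 5 GB for weights,
# leaving the rest of a 12 GB card for KV cache, activations, and overhead
weights_gb = quantized_weight_gb(9e9)
```

This is why a 9B Q4 model fits comfortably on a 12 GB card: the weights themselves take less than half the VRAM, and the remaining headroom goes to the context window.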
Sudo su @sudoingX

oss everywhere

0 replies · 0 reposts · 4 likes · 559 views
Sudo su @sudoingX

i woke up. i conquered the field and now i am setting. good night builders. this is an open source world. they are just living in it.

2 replies · 0 reposts · 44 likes · 1.2K views
Sudo su @sudoingX

mac and linux handle the openai-compatible endpoints differently. the fix in the latest update queries /v1/models and /props from your server on startup so it reads the real context. make sure you're on the latest hermes version on both machines. if the mac still shows 2M after updating, share your config.yaml and llama-server command and i'll debug it.

2 replies · 0 reposts · 1 like · 113 views
shane @shaneswrld_

@sudoingX my model shows but my mac is reporting a 2M context window for minimax2.7.. however my hermes running on my linux server, same model, is reporting a 200K window. what system are you on? it might be a system specific error for macs? they said hermes runs best on linux.

1 reply · 0 reposts · 0 likes · 133 views
Sudo su @sudoingX

if you run hermes agent on a local GPU, your status bar probably shows claude-opus-4.6 and 2M context even when you're running a completely different model with 128K context. every local user hits this. submitted a fix. hermes now auto-detects your actual model name and context length from your local server on startup. no manual config needed. no more wrong branding. PR live and should be merged soon. this team ships fast.

Sudo su @sudoingX

i keep getting this question. hermes agent shows 2M context when you're running a local model with 128K. it's not a bug in your setup. hermes was designed API first, so it defaults to the highest probe tier for unknown models. your llama-server knows the real context but hermes doesn't ask it. i patched mine manually. now planning a PR to auto detect model name and context from your local server on startup. one API call on init fixes it for every local runner. if you're hitting this right now, the workaround is in the reply.

10 replies · 1 repost · 152 likes · 7.3K views
Ronak Malde @rronak_

This paper is so good that I almost didn't want to share it. Ignore the OpenClaw clickbait: OPD + RL on real agentic tasks with significant results is very exciting, and moves us away from needing verifiable rewards.

Authors: @YinjieW2024, Xuyang Chen, Xialong Jin, @MengdiWang10, @LingYang_PU

[tweet media]

31 replies · 126 reposts · 1.1K likes · 140K views
Sudo su @sudoingX

@nikitabier the screen is running a local model at 50 tok/s and it doesn't need my attention. that's the point.

2 replies · 0 reposts · 14 likes · 753 views
Nikita Bier @nikitabier

No, there's nothing over there. Come back to the screen.

1.6K replies · 203 reposts · 3.5K likes · 166.7K views