Sudo su
6.5K posts

[profile banner]

Sudo su @sudoingX

GPU/local LLM and more RAM.

Bangkok, Thailand · Joined August 2022
781 Following · 13.5K Followers

Pinned Tweet
Sudo su @sudoingX

let me get you started in local AI and bring you to the edge. if you have a GPU or are thinking about diving into the local LLM rabbit hole, the first thing you do before any setup is join x/LocalLLaMA. this is the community that will help you at every step. post your issue and we will direct you, debug with you, and save you hours of work.

once you're in, follow these three:

@TheAhmadOsman: the oracle. this is where you consume the latest edges in infrastructure and AI. if something dropped, you hear it from him first. his content alone will keep you ahead of most.

@0xsero: a one-man army when it comes to model compression, novel quantization research, and new tools and tricks that make your local setup better. you will learn, experiment, and discover things you didn't know existed.

@Teknium: maker of Hermes Agent, the agent i use every day, from @NousResearch. from Teknium you don't just stay at the frontier, you get your hands on the tools before everyone else. this is where things are headed.

if you follow me, follow these three and join the community. you will be ahead of most people in this space. if you run into wrong configs, get stuck debugging hardware, or can't get a model to load, post there so we can help.

get started with local AI now. don't just understand the stack, own your cognition. don't pay openai fees on top of giving them your prompts, your research, and your most valuable thinking to be monitored and metered. buy a GPU and build your own token factory.

[tweet media]

40 replies · 50 reposts · 657 likes · 32.2K views
Sudo su @sudoingX

they are keeping me so compute constrained. I would have double GPUs, double RAM, but hardware prices said no. for now.

0 replies · 0 reposts · 6 likes · 117 views
Sudo su @sudoingX

what have I done. holy shit, this is magic. literal magic.

3 replies · 0 reposts · 14 likes · 933 views
Sudo su @sudoingX

but i understand why some things have to stay in the lab.

1 reply · 0 reposts · 0 likes · 304 views
Sudo su @sudoingX

reward the right behavior long enough and anything learns. RL is just the universe's update rule. good morning btw, let's fucking go 🔥

0 replies · 0 reposts · 1 like · 267 views
Sudo su @sudoingX

every morning i wake up and ask myself what can't be solved with reinforcement learning. still waiting for an answer.

2 replies · 0 reposts · 12 likes · 608 views
Sudo su @sudoingX

Hermes agent from bed

2 replies · 1 repost · 36 likes · 1.3K views
Sudo su @sudoingX

@outsourc_e did you wrap the majestic agent into a workspace? interesting.

1 reply · 0 reposts · 1 like · 64 views
Sudo su @sudoingX

@mfranz_on anon, upgrade yourself from that bloated TypeScript mess to the majestic Hermes agent. you deserve better tools. anyone who hasn't.

0 replies · 1 repost · 19 likes · 391 views
Tommy @yeahfortommy

Are we a merch company or a publishing company?

[tweet media]

27 replies · 7 reposts · 107 likes · 6.2K views
Sudo su retweeted
Nous Research @NousResearch

Hermes Agent wrote a novel. "The Second Son of the House of Bells" runs 79,456 words across 19 chapters. The agent built its own pipeline to do it, using the same modify-evaluate-keep/discard loop as @karpathy's Autoresearch but applied to fiction: world-building, chapter drafting, adversarial editing, Opus review loops, LaTeX typesetting, cover art, audiobook generation, and landing page setup.

Book: nousresearch.com/bells
Code: github.com/NousResearch/a…

[tweet media]

emozilla @theemozilla

it's been a longstanding dream of mine to build an AI system that can tell a compelling story. it's what got me started in the space in the beginning, and with Hermes Agent I finally pulled it off. 100% written, typeset, etc. by Hermes Agent. those at our GTC event got hard copies 🤗

39 replies · 51 reposts · 732 likes · 61.1K views
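The modify-evaluate-keep/discard loop described above generalizes beyond fiction. A minimal sketch in Python, where `propose_edit` and `score` are hypothetical stand-ins for the agent's drafting and adversarial-review steps (the real pipeline is in the linked NousResearch repo, not this code):

```python
import random

def keep_discard_loop(draft, propose_edit, score, iterations=100):
    """Generic modify-evaluate-keep/discard loop:
    propose a candidate edit, keep it only if the score improves,
    otherwise discard it and try again from the current best."""
    best_score = score(draft)
    for _ in range(iterations):
        candidate = propose_edit(draft)      # modify
        candidate_score = score(candidate)   # evaluate
        if candidate_score > best_score:     # keep
            draft, best_score = candidate, candidate_score
        # else: discard
    return draft

# toy demo: "edit" a list of numbers toward a higher sum
random.seed(0)
result = keep_discard_loop(
    [1, 2, 3],
    propose_edit=lambda d: [x + random.choice([-1, 1]) for x in d],
    score=sum,
)
```

Because a candidate is only kept when the score strictly improves, the loop can never make the draft worse by its own metric; the interesting engineering is in the evaluator, not the loop.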
Sudo su retweeted
Sudo su @sudoingX

hear this anon: you don't need a $4,699 box to get started with local AI. use what you already have first. test your workload.

this is what a $250 GPU did today. iteration 3 of octopus invaders is here. 4 phases. 6 prompts. zero handwritten code. the same 9B on the same 3060 fixed its own enemy spawning, patched a dual start conflict, added level progression, resized every bullet, and when the browser cached old files it figured that out on its own and added version parameters to force a reload. 3,200+ lines across 13 files. every line by qwen 3.5 9B Q4 at 35-50 tok/s on 12 gigs through hermes agent.

understand what your load actually needs before you build. don't get trapped by influencers selling you boxes next to a plant. test on what you have. then decide. this 3060 impressed me in ways i did not expect and its autonomy is what kept me going. now it's time to move to new experiments on other nodes and other models for all of us.

if you are running this setup, the exact stack, flags, open source code, and exact prompts i used are in the replies. if you run into issues, let me know. seeing students and builders discover hermes from my posts and start running local is why i do this. full autonomous build at 8x speed in the video. gameplay at the end. watch it.

Sudo su @sudoingX

this is what 12 gigs of VRAM built in 2026. a 9 billion parameter model running on a 5 year old RTX 3060 wrote a full space shooter from a single prompt. blank screen on first try. i came back with a bug list and the same model on the same card fixed every issue across 11 files without touching a single line myself. enemies still looked wrong so i pushed another iteration and now the game has pixel art octopi, particle effects, screen shake, projectile physics and a combo system. all running locally on a card that was designed to play fortnite.

three iterations. zero cloud. zero API calls. every token generated on hardware sitting under my desk. the model reads its own code, finds what's broken, patches it, validates syntax and restarts the server. i just describe what's wrong and it handles the rest.

people are paying monthly subscriptions to type into a browser tab and wait for a server farm to respond. meanwhile a GPU you can find used on ebay is running a full autonomous hermes agent framework with 31 tools, a 128K context window and thinking mode, generating at 29 tokens per second nonstop.

the game still needs work. level upgrades don't trigger and boss fights need tuning. but the fact that i'm iterating on gameplay balance instead of debugging whether the code runs at all tells you where this is headed. every iteration the game gets better on the same hardware. same 12 gigs. same 9 billion parameters. same RTX 3060 from 5 years ago. your GPU is not a gaming card anymore. it's a local AI lab that never sends your data anywhere.

25 replies · 26 reposts · 391 likes · 37.6K views
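"Understand what your load actually needs" can start with napkin math: weights at Q4 precision are a known fraction of the full-precision size. A rough sketch, assuming ~4.5 bits per weight as an approximation for Q4-family quants (actual figures vary by quant type, and KV cache plus activations come on top):

```python
def quantized_weight_gb(n_params: float, bits_per_weight: float = 4.5) -> float:
    """Rough VRAM needed for model weights alone, in gigabytes.
    bits_per_weight ~4.5 is an approximation for Q4-family quants."""
    return n_params * bits_per_weight / 8 / 1e9

# a 9B-parameter model at ~4.5 bits/weight needs roughly 5 GB for weights,
# leaving the rest of a 12 GB card for KV cache, activations, and overhead
weights_gb = quantized_weight_gb(9e9)
```

This is why a 9B Q4 model fits comfortably on a 12 GB card: the weights themselves take less than half the VRAM, and the remaining headroom goes to the context window.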
Sudo su @sudoingX

oss everywhere

0 replies · 0 reposts · 4 likes · 559 views
Sudo su @sudoingX

i woke up. i conquered the field and now i am setting. good night builders. this is an open source world. they are just living in it.

2 replies · 0 reposts · 44 likes · 1.2K views
Sudo su @sudoingX

mac and linux handle the openai-compatible endpoints differently. the fix in the latest update queries /v1/models and /props from your server on startup so it reads the real context. make sure you're on the latest hermes version on both machines. if the mac still shows 2M after updating, share your config.yaml and llama-server command and i'll debug it.

2 replies · 0 reposts · 1 like · 113 views
shane @shaneswrld_

@sudoingX my model shows but my mac is reporting a 2M context window for minimax2.7.. however my hermes running on my linux server, same model, is reporting a 200K window. what system are you on? it might be a system specific error for macs? they said hermes runs best on linux.

1 reply · 0 reposts · 0 likes · 133 views
Sudo su @sudoingX

if you run hermes agent on a local GPU, your status bar probably shows claude-opus-4.6 and 2M context even when you're running a completely different model with 128K context. every local user hits this. submitted a fix. hermes now auto-detects your actual model name and context length from your local server on startup. no manual config needed. no more wrong branding. PR live and should be merged soon. this team ships fast.

Sudo su @sudoingX

i keep getting this question. hermes agent shows 2M context when you're running a local model with 128K. it's not a bug in your setup. hermes was designed API first, so it defaults to the highest probe tier for unknown models. your llama-server knows the real context but hermes doesn't ask it. i patched mine manually. now planning a PR to auto detect model name and context from your local server on startup. one API call on init fixes it for every local runner. if you're hitting this right now, the workaround is in the reply.

10 replies · 1 repost · 152 likes · 7.3K views
Ronak Malde @rronak_

This paper is so good that I almost didn't want to share it. Ignore the OpenClaw clickbait: OPD + RL on real agentic tasks with significant results is very exciting, and moves us away from needing verifiable rewards.

Authors: @YinjieW2024, Xuyang Chen, Xialong Jin, @MengdiWang10, @LingYang_PU

[tweet media]

31 replies · 126 reposts · 1.1K likes · 140K views
Sudo su @sudoingX

@nikitabier the screen is running a local model at 50 tok/s and it doesn't need my attention. that's the point.

2 replies · 0 reposts · 14 likes · 753 views
Nikita Bier @nikitabier

No, there's nothing over there. Come back to the screen.

1.6K replies · 203 reposts · 3.5K likes · 166.7K views