Sudo su

6.6K posts


@sudoingX

GPU/local LLM and more RAM.

Bangkok, Thailand · Joined August 2022
783 Following · 13.5K Followers
Pinned Tweet
Sudo su
Sudo su@sudoingX·
let me get you started in local AI and bring you to the edge. if you have a GPU or are thinking about diving into the local LLM rabbit hole, the first thing you do before any setup is join x/LocalLLaMA. this is the community that will help you at every step. post your issue and we will direct you, debug with you, and save you hours of work.

once you're in, follow these three:

@TheAhmadOsman. the oracle. this is where you consume the latest edges in infrastructure and AI. if something dropped you hear it from him first. his content alone will keep you ahead of most.

@0xsero. one-man army when it comes to model compression, novel quantization research, and new tools and tricks that make your local setup better. you will learn, experiment, and discover things you didn't know existed.

@Teknium. maker of Hermes Agent, the agent i use every day, from @NousResearch. from Teknium you don't just stay at the frontier, you get your hands on the tools before everyone else. this is where things are headed.

if you follow me, follow these three and join the community. you will be ahead of most people in this space. if you run into wrong configs, get stuck debugging hardware, or can't get a model to load, post there so we can help.

get started with local AI now. not only understand the stack but own your cognition. don't pay openai fees on top of giving them your prompts, your research, and your most valuable thinking to be monitored and metered. buy a GPU and build your own token factory.
Sudo su tweet media
42 replies · 50 reposts · 661 likes · 32.7K views
Sudo su
Sudo su@sudoingX·
been getting DMs and comments asking how to support the open source work. i don't take donations or tokens. everything i ship is free and stays free. if you want to back the mission the only way is the $12/mo X sub. that funds GPU hours, benchmarks, and more open source releases. DM me your GPU after subscribing and i'll personally help you set up.
Grim@GrimCreep1

@sudoingX Are you open to taking donations on the GitHub?

0 replies · 1 repost · 26 likes · 1K views
Sudo su
Sudo su@sudoingX·
@GrimCreep1 appreciate that. best way to support is the $12/mo sub. DM me your GPU and i'll help you set up. every sub funds more GPU hours and more open source.
0 replies · 0 reposts · 0 likes · 22 views
Sudo su
Sudo su@sudoingX·
hear this anon: you don't need a $4,699 box to get started with local AI. use what you already have first. test your workload. this is what a $250 GPU did today.

iteration 3 of octopus invaders is here. 4 phases. 6 prompts. zero handwritten code. the same 9B on the same 3060 fixed its own enemy spawning, patched a dual start conflict, added level progression, resized every bullet, and when the browser cached old files it figured that out on its own and added version parameters to force a reload. 3,200+ lines across 13 files. every line by qwen 3.5 9B Q4 at 35-50 tok/s on 12 gigs through hermes agent.

understand what your load actually needs before you build. don't get trapped by influencers selling you boxes next to a plant. test on what you have. then decide. this 3060 impressed me in ways i did not expect and its autonomy is what kept me going. now it's time to move to new experiments on other nodes and other models for all of us.

if you are running this setup, the exact stack, flags, open source code, and exact prompts i used are in the replies. if you run into issues let me know. seeing students and builders discover hermes from my posts and start running local is why i do this. full autonomous build at 8x speed in the video. gameplay at the end. watch it.
Sudo su@sudoingX

this is what 12 gigs of VRAM built in 2026. a 9 billion parameter model running on a 5 year old RTX 3060 wrote a full space shooter from a single prompt. blank screen on the first try. i came back with a bug list and the same model on the same card fixed every issue across 11 files without touching a single line myself. enemies still looked wrong so i pushed another iteration and now the game has pixel art octopi, particle effects, screen shake, projectile physics and a combo system. all running locally on a card that was designed to play fortnite.

three iterations. zero cloud. zero API calls. every token generated on hardware sitting under my desk. the model reads its own code, finds what's broken, patches it, validates syntax and restarts the server. i just describe what's wrong and it handles the rest.

people are paying monthly subscriptions to type into a browser tab and wait for a server farm to respond. meanwhile a GPU you can find used on ebay is running a full autonomous hermes agent framework with 31 tools, a 128K context window and thinking mode, generating at 29 tokens per second nonstop.

the game still needs work. level upgrades don't trigger and boss fights need tuning. but the fact that i'm iterating on gameplay balance instead of debugging whether the code runs at all tells you where this is headed. every iteration the game gets better on the same hardware. same 12 gigs. same 9 billion parameters. same RTX 3060 from 5 years ago. your GPU is not a gaming card anymore. it's a local AI lab that never sends your data anywhere.

27 replies · 32 reposts · 437 likes · 43.1K views
Sudo su
Sudo su@sudoingX·
@r0ck3t23 the gap between "AI will destroy us" and "I ran a 9B model on a $300 GPU and it built a game" is the entire problem with this conversation. builders know what this is. commentators don't.
2 replies · 0 reposts · 14 likes · 296 views
Dustin
Dustin@r0ck3t23·
Jensen Huang just told every AI leader in the room to grow up. Stop scaring the public with science fiction. Start communicating like the weight of civilization is on your shoulders. Because it is.

Huang: “AI is not a biological being. It is not alien. It is not conscious. It is computer software.”

That single statement dismantles half the panic surrounding this industry. The mainstream conversation is dominated by people projecting human malice onto math. Alien consciousness onto code. Existential dread onto a software architecture we built, we trained, and we can read.

Huang: “We say things like, ‘We don’t understand it at all.’ It is not true. We understand a lot of things about this technology.”

When builders tell the public they don’t understand their own creation, the public hears threat. The state responds with control. That is already happening. Palihapitiya asked Huang what he would have told Anthropic during their regulatory clash with the Department of Defense. Huang didn’t attack the technology. He attacked the communication.

Huang: “The desire to warn people about the capability of the technology is really terrific. We just have to make sure that we understand that the world has a spectrum, and that warning is good, scaring is less good because this technology is too important to us.”

Warning shows risks, mitigation, and why the upside overwhelms the downside. Scaring says we might be building something that destroys us and we can’t stop it. One builds trust. The other invites regulation written in panic.

Huang: “To say things that are quite extreme, quite catastrophic, that there’s no evidence of it happening, could be more damaging than people think.”

Projecting catastrophe without evidence is not caution. It is sabotage. When your technology is embedded in national defense, the financial system, and healthcare infrastructure, your words carry structural weight. If the architects act terrified of their own product, the response is predictable. Governments step in. They restrict. They seize control of something they don’t understand because the builders told them to be afraid.

Huang: “There was a time when nobody listened to us, but now because technology is so important in the social fabric, such an important industry, so important to national security, our words do matter.”

Most tech founders have not internalized this. You are no longer a startup founder disrupting an industry. You are running infrastructure that nations depend on. Your statements move policy. Your framing shapes legislation. Your tone determines whether governments treat you as partner or threat.

Huang: “We have to be much more circumspect, we have to be more moderate, we have to be more balanced, we have to be far more thoughtful.”

Huang did not ask for silence. He asked for precision. The leaders who cannot tell the difference will not be leading for long.
102 replies · 86 reposts · 443 likes · 41.2K views
Sudo su
Sudo su@sudoingX·
@signulll this is exactly what running local models teaches you. you stop writing code and start evaluating it. the model outputs, you judge. the faster you can spot what's off the faster you ship.
0 replies · 0 reposts · 0 likes · 105 views
signüll
signüll@signulll·
with ai increasingly writing more & more code, engineers shift from makers to critics. taste, judgment, & the ability to recognize when something is wrong without being able to immediately articulate why is what compounds now more than ever before. i.e. the terminal skill is aesthetic discernment applied to large systems, which was always rare as hell & is now the only scarce thing.
32 replies · 6 reposts · 103 likes · 5.9K views
Sudo su
Sudo su@sudoingX·
@xeraphims so openai already proved the calc tool approach works. they just over optimized the trigger. the tool itself was the right call. now the open source local stack needs the same tool without the reward hacking.
0 replies · 0 reposts · 2 likes · 257 views
Sudo su
Sudo su@sudoingX·
thinking out loud. every model gets math wrong. 7B, 9B, 70B. doesn't matter. pattern matching is not computation. hermes agent has code_execution which spins up a full python sandbox with RPC over unix sockets. powerful but heavy. a 9B isn't going to navigate that reliably for basic arithmetic. what if there was a lightweight calc tool built in. model hits a math question, calls the tool, gets the exact answer computed on your hardware. no interpreter overhead. sandboxed. simple enough schema that a 9B can call it every time. the accuracy problem stops being a model problem and becomes an infrastructure problem. and infrastructure is solvable. @Teknium would this belong in hermes agent or is code_execution enough?
22 replies · 5 reposts · 133 likes · 7.4K views
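the one-field calc tool described above can be sketched in a few lines. this is a hypothetical sketch, not hermes agent code: the `calc` name and its single `expression` field are assumptions. the evaluator walks the python `ast` and only allows plain arithmetic nodes, so the host computes exact answers without ever executing model-written code.

```python
import ast
import operator

# Map of allowed AST operator types to their arithmetic implementations.
# Anything outside this table (names, calls, attributes) is rejected.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
    ast.USub: operator.neg,
}

def calc(expression: str):
    """Evaluate a pure-arithmetic expression string exactly and safely."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("disallowed expression")
    return _eval(ast.parse(expression, mode="eval"))

print(calc("847 * 293"))  # 248171
```

the model only has to emit the expression string; the tool never spins up an interpreter, which is the whole reliability argument versus a full code_execution sandbox.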
Sudo su
Sudo su@sudoingX·
@yaboilyrical how do i get my hands on one of those. shipping to bangkok is worth it for hermes merch.
0 replies · 0 reposts · 0 likes · 28 views
Sudo su
Sudo su@sudoingX·
@uttertard the language doesn't matter much. the key is the schema the model sees. one field, expression in, answer out. whether the backend is JS, python, or raw C the model just needs to output "847 * 293" and get the right number back.
1 reply · 0 reposts · 1 like · 499 views
uttertard
uttertard@uttertard·
@sudoingX If you want to skip python would a calculator built in javascript and passed as a skill make sense?
1 reply · 0 reposts · 2 likes · 565 views
Sudo su
Sudo su@sudoingX·
@startupideaspod that's a lot of duct tape for a problem hermes agent solved at the framework level. persistent memory, session search, daily context. no manual setup. you deserve better tools
0 replies · 0 reposts · 8 likes · 161 views
The Startup Ideas Podcast (SIP) 🧃@startupideaspod·
"Why does my OpenClaw forget everything?" Because nothing was saved in the first place. Here's the 3-layer memory fix:

memory.md:
- Your agent's long-term brain.
- High-level learnings, preferences, insights.
- If this file doesn't exist yet, tell your agent to create it.

Daily memory folder:
- Granular logs created every day.
- More detailed than memory.md.
- This is where session-level context lives.

Compaction flush:
- Before your agent summarizes and compresses a long session, force it to write everything to memory first.
- Otherwise context gets lost when the window fills up.

Then add a 30-minute auto-save heartbeat:
- Check if today's memory file exists
- Create it if missing
- Log a summary of the current session

Fix your memory system before you touch anything else. That's where it clicks.
GREG ISENBERG@gregisenberg

THE ULTIMATE GUIDE TO OPENCLAW (1hr free masterclass)

1. fix memory so it compounds. add MEMORY.md + daily logs. instruct it to promote important learnings into MEMORY.md, because this is what makes it improve over time
2. set up personalization early. identity.md, user.md, soul.md. write these properly or everything feels generic. this is what makes it sound like you and understand your world
3. structure your workspace properly. most setups break because the foundation is messy. folders, files, and roles need to be clean or everything downstream degrades
4. create a troubleshooting baseline. make a separate claude/chatgpt project just for openclaw. download the openclaw docs (context7) and load them in. when things break, it checks docs instead of guessing. this alone fixes most issues!!
5. configure models and fallbacks. set the primary model to GPT 5.4 and add fallbacks across providers. this is what keeps tasks running instead of failing mid-way
6. turn repeat work into skills. install the summarize skill early. anything you do 2–3 times → turn it into a skill. this is how it starts executing real workflows
7. connect tools with clear rules. add browser + search (brave api). use the managed browser for automation. use chrome relay only when login is needed. this avoids flaky behavior
8. use heartbeat to keep it alive. add rules to check memory + cron health. if jobs are stale, force-run them. this prevents silent failures
9. use cron to schedule real work. set daily and weekly tasks: reports, follow-ups, content workflows. this is where it starts acting without you
10. lock down security properly. move secrets to a separate env file outside the workspace. set strict permissions (folder 700, file 600). use allowlists for telegram access. don't expose your gateway publicly
11. understand what openclaw actually is. it's a system that remembers, acts, and improves. basically, closer to an employee than a tool

this ep of @startupideaspod is now out w/ @moritzkremb. it's literally a full 1hr free course to take you from "i installed openclaw" to "this thing is actually working for me". most people are one step away from openclaw working. they installed it, they tried it and it didn't click. this ep will make it click. all free, no advertisers, i just want to see you build your ideas with this ultimate guide to openclaw. watch

9 replies · 12 reposts · 106 likes · 11.6K views
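the 30-minute heartbeat step from the memory fix above can be sketched as a small script. this is a minimal sketch under assumptions: the file layout (a `memory` folder of daily `.md` logs) follows the post's description, not openclaw's actual implementation, and you would call it from whatever scheduler runs your heartbeat.

```python
import time
from datetime import date
from pathlib import Path

# Assumed layout from the post: one markdown log per day inside ./memory/
MEMORY_DIR = Path("memory")

def heartbeat(summary: str) -> Path:
    """Ensure today's memory file exists, then append a session summary."""
    MEMORY_DIR.mkdir(exist_ok=True)
    today_file = MEMORY_DIR / f"{date.today().isoformat()}.md"
    if not today_file.exists():
        # Create the daily log if this is the first heartbeat of the day.
        today_file.write_text(f"# Memory log {date.today().isoformat()}\n")
    with today_file.open("a", encoding="utf-8") as f:
        # Timestamped bullet so each 30-minute tick is visible in the log.
        f.write(f"\n- {time.strftime('%H:%M')} {summary}\n")
    return today_file
```

wiring this to a real 30-minute cadence would be a cron entry or scheduler loop; the function itself only does the "check, create if missing, log a summary" part.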
Sudo su
Sudo su@sudoingX·
@schinsly for sure, on a capable model you can ask and it handles it. the gap shows up when you're running 7B-14B on consumer hardware. those models call tools reliably but can't generate correct python consistently. that's who this is for.
1 reply · 0 reposts · 4 likes · 120 views
Schinsly
Schinsly@schinsly·
@sudoingX i guess yeah it hasn't been done before but like i could one-shot that by just asking my agent if it was something i needed
2 replies · 0 reposts · 1 like · 121 views
Sudo su
Sudo su@sudoingX·
@DasMarky99 exactly. one tool, one field, model outputs the expression, hardware computes the answer. that's the whole idea.
0 replies · 0 reposts · 0 likes · 402 views
Matu
Matu@DasMarky99·
@sudoingX Wouldn't be enough to expose a new "calc" function to the llm ?
1 reply · 0 reposts · 0 likes · 453 views
Sudo su
Sudo su@sudoingX·
you're right, the pieces exist. the question is whether a 9B can use them reliably. code_execution needs the model to generate valid python with correct syntax, imports, and print statements. a calc tool with a one field schema just needs the model to output "847 * 293". the tool computes the result. same math, completely different reliability at 7B-14B scale.
2 replies · 0 reposts · 8 likes · 659 views
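for illustration, here is what that one-field schema could look like in the openai-style function-calling format that many local stacks mimic. the names here are hypothetical, not hermes agent's actual tool definitions:

```python
# Hypothetical one-field "calc" tool schema. The point of the reliability
# argument: a 7B-14B model only has to fill a single string field, versus
# generating syntactically valid python for a code_execution sandbox.
calc_tool = {
    "name": "calc",
    "description": "Evaluate an arithmetic expression exactly.",
    "parameters": {
        "type": "object",
        "properties": {
            "expression": {
                "type": "string",
                "description": "Plain arithmetic, e.g. '847 * 293'",
            }
        },
        "required": ["expression"],
    },
}

# The model's entire output for a tool call is then just:
#   {"expression": "847 * 293"}
```

one required field, no imports, no print statements: the schema is the interface the small model sees, and shrinking it is what moves accuracy from a model problem to an infrastructure problem.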
Schinsly
Schinsly@schinsly·
@sudoingX this isn't super novel imo. the agent can literally just do math in console manually, follow a skill, or call a cli.
1 reply · 0 reposts · 3 likes · 715 views
Sudo su
Sudo su@sudoingX·
@drewsky1 can't say which yet. but nothing was removed. only added.
0 replies · 0 reposts · 0 likes · 21 views
Sudo su
Sudo su@sudoingX·
What have I done. holy shit, this is magic. literal magic.
6 replies · 0 reposts · 68 likes · 5.6K views
Sudo su
Sudo su@sudoingX·
@ArbitorofOZ the greater good is making it so accessible that nobody needs permission to use it. open source everything. let the magic spread.
1 reply · 1 repost · 10 likes · 381 views
TradeVet
TradeVet@ArbitorofOZ·
@sudoingX I feel this so hard. Each of your replies to others is also what I’m going through. Looking forward to reading about the good works you will accomplish with the magic you have discovered. My only question: do you feel the urge to apply it for the greater good yet?
1 reply · 0 reposts · 0 likes · 409 views
Sudo su
Sudo su@sudoingX·
they are keeping me so compute-constrained. I would have double the GPUs and double the RAM, but hardware prices said no. for now.
4 replies · 0 reposts · 29 likes · 1.6K views