Crown 👑

28.4K posts

@ciruai

Local AI Consulting. AI is about the workflow, not the model. AMD Local LLM Group: https://t.co/0wQDCDXlzO

United States · Joined May 2009

2.8K Following · 7.1K Followers

Pinned Tweet

Crown 👑@ciruai·31 May

@the_jimmy_jones x.com/i/chat/group_j…

Crown 👑@ciruai·28m

@SMNYC1 @arunninghacker blows my mind what even a 3GB model knows. Where does it keep it all? 😂

bird@SMNYC1·56m

@ciruai @arunninghacker Yes! That is amazing compression right?

Volodymyr Styran 🇺🇦@arunninghacker·1d

I ran OpenCode riding an uncensored Qwen3.6 running on a Mini-sized shiny silver box with an NVIDIA card in it for a couple of days, and I assure you: all this ethics/regulations/export control frontier AI drama will be over very, very soon

Crown 👑@ciruai·1h

@SMNYC1 @arunninghacker Imagine power is out for extended period and you're the only house which basically still has the whole internet self contained within it!

bird@SMNYC1·1h

@ciruai @arunninghacker I feel like there has to be a market among the people who buy the 10-year food storage kits. They could rebuild civilization after they come out of the bunker with AI in a box. It's a neat thought.

Crown 👑@ciruai·1h

@SMNYC1 @arunninghacker Currently solar powered and running AI locally. Hypothetically, if power and internet go out, I could still run AI at home.

bird@SMNYC1·2h

@ciruai @arunninghacker It's an amazing way to compress a ton of information. Like wow. Having it work durably and offline would be good if the web ever blows up. It's always a DNS issue lol

Crown 👑@ciruai·2h

@GMMeyer @benrayfield @BLUECOW009 I have no illusions about someone's ability to just download a weight file. Specifics are going to matter. That's why I say we can't really war game it without the details.

Greg Meyer@GMMeyer·2h

@ciruai @benrayfield @BLUECOW009 have you ever worked at an enterprise tech company? i can 100% guarantee almost no one has access to the model itself in the way you’re suggesting and any access of this kind is logged

@bluecow 🐮@BLUECOW009·1d

Kinda crazy we never had the weights for a sota model be leaked

Crown 👑@ciruai·2h

I rely heavily on openai still at home, even though I have access to dozens of models running locally and have spent hundreds of hours preparing for local only use. A couple of times in the last month the internet went out for an extended period (someone crashed into a main hub and died 🥲). Had to test everything I had been working on and keep moving without it. Found some important gaps. It's a lot of fun though, to be honest.

bird@SMNYC1·2h

@ciruai @arunninghacker Yeah. I'm from the before times when we had our own computers and no internet. Ownership is kinda retro and good sec ops.

Crown 👑@ciruai·2h

GPU VMs are much more expensive than just using an API. It's going to cost you $1+ an hour. Adds up if it's always on. An API for chatting probably will only cost you $20 a month or less but at that point why are you not just using gpt anyway? The reason to go local is to be in control.

bird@SMNYC1·2h

@ciruai @arunninghacker Cool! Curious why not just host a vm somewhere? With a gpu

Crown 👑@ciruai·2h

@GMMeyer @benrayfield @BLUECOW009 Would need to dig into specifics to map out the best way. Kind of hard to say hypothetically. Impossible to stop if people can't be trusted. In the end it's your people who protect you.

Greg Meyer@GMMeyer·3h

@ciruai @benrayfield @BLUECOW009 x.com/gmmeyer/status…

Greg Meyer@GMMeyer

@benrayfield @BLUECOW009 okay let’s go back to basics: in any enterprise secure tech is gated and logged, if anyone downloads this it would set off alarms because no one should actually have this on their computer

Crown 👑@ciruai·2h

Can do it on a nice $1000 laptop too, of course. Under $5000 is the same answer as "what does a nice computer cost". I have a nice chat bot on a $150 laptop, but that's not going to replace the experience of chatting with ChatGPT. You can with better hardware (not including image gen and all the other fancy tools it has now, just chatting, unless the price gets closer to that $5000 number).

bird@SMNYC1·3h

@ciruai @arunninghacker I'm a dev, so I can get it to run. But 5k for a library chat bot seems steep. 🤔

Crown 👑@ciruai·5h

@MrPeterLMorris @sudoingX @FrameworkPuter Assume whatever you want. He repeated the invalid claim again, and Framework responded to my tweet confirming the issue. So whatever your assumptions are, the reality isn't looking good for this lazy influencer's credibility.

Peter Morris@MrPeterLMorris·5h

@ciruai @sudoingX @FrameworkPuter I assume they gave him info to discuss his findings and how to ask for help, so they look good.

Sudo su@sudoingX·1d

the one box i was missing just landed anon. this is the @FrameworkPuter desktop with amd's strix halo, ryzen ai max+ 395, 128gb of unified memory, up to 96 of it addressable as vram. amd and framework sent it over for honest testing, no strings attached, and i've been waiting on this one specifically. here's why it matters. i've run local ai on basically everything, a 150 dollar drawer card, a 3090, a 5090, the dgx spark, datacenter h200s. the one gap was always the accessible big memory tier on the amd side, and this fills it. 128gb unified at roughly half the price of the nvidia equivalent, the sovereignty box for people who want to run real models without a datacenter budget. booting it today. and the question i actually want answered is the one nobody answers straight: what does this thing really run? same bar i hold every other card to. amd, nvidia, apple, measured, never vibes. let's find out what it's got.

Sudo su@sudoingX

listen up ROCm and Vulkan builders. @FrameworkPuter just shipped me strix halo desktop, 128GB unified, landing on my desk tuesday. everyone keeps asking what actually runs on this thing beyond vendor charts and forum guesses. so i'm going to answer it properly. starting with big MoE models since massive total params on light active is the whole point of 128GB unified. if there's a specific model or quant you want tested on strix halo, reply and it goes in the queue.

Crown 👑@ciruai·5h

@MrPeterLMorris @sudoingX @FrameworkPuter it matters in a bad way if they are saying your 128gb system can only handle using 96gb

Peter Morris@MrPeterLMorris·5h

@ciruai @sudoingX @FrameworkPuter He has 30k+ followers, that's mostly what matters

Crown 👑@ciruai·5h

@GMMeyer @benrayfield @BLUECOW009 You would take it out physically. Obviously.

Greg Meyer@GMMeyer·15h

@benrayfield @BLUECOW009 this is not the part that would set off alarms…

Crown 👑@ciruai·5h

@michal_dolnik @sudoingX Apple doesn't make anything better than it unless you spend twice as much.

Michal Dolnik@michal_dolnik·10h

@sudoingX Better than a Mac Mini for running local models?

Sudo su@sudoingX·15h

AMD

Crown 👑@ciruai·6h

You can own it now; it's just a matter of what you expect to be able to do with it. For people who just want to chat where they already are, you can have incredibly intelligent chatting AI that does research for you, all of that under $5,000 easily. Incredibly intelligent chatbots are solved. Fully agentic AI developing complex code and projects or doing full system administration takes a lot more coordination and intentional workflows in many cases. There is still a lot of skill in the driver here, so it can absolutely be done on local hardware under $5K very well, but not out of the box whatsoever. A person who doesn't know what they're doing is going to think it doesn't work at all for that.

bird@SMNYC1·8h

@ciruai @arunninghacker I'm not sure yet. I can't think of anything worth building. Mostly interested in when I can own the server side in my house. It's a matter of time before commodity models get good enough. Depends on the next few months and who goes under.

Crown 👑@ciruai·6h

@MrPeterLMorris @sudoingX Of course you can always spend twice the money for better performance with a much higher power bill. They are great for having good enough performance on fairly large models with high context and very low power.

Peter Morris@MrPeterLMorris·9h

@ciruai @sudoingX Both machines too slow for my taste.

Crown 👑@ciruai·18h

Someone please ping @sudoingX and tell him to stop saying that AMD Strix Halo can only use 96GB as VRAM. It is a real shared memory pool. Just has to be configured properly. While you're at it show him my 2000 rows of benchmarks since he keeps asking "does anyone have any AMD benchmarks" and ignoring the best answers. llm.ciru.ai

Framework@FrameworkPuter

@barackomaba @sudoingX @JozsefSzalma This is correct :)

Crown 👑@ciruai·11h

@usr_bin_roygbiv I just asked it what a good new local model is and it said qwen 3.5 122b...

Roy@usr_bin_roygbiv·13h

grok is actually expensive as fuck for how awful it is

Crown 👑@ciruai·11h

@Tilarium @usr_bin_roygbiv Ok, I trust you. They benchmark really well. I need to have time to spend with it in the harness. Even dominated the Hermes bench which is exactly the kind of tools you'd expect would make it good for that. (The 12b)

Alex 🟢 🇮🇱 🇺🇦 🇺🇸@Tilarium·11h

@ciruai @usr_bin_roygbiv I tested this while driving full blown ai assistant (not coding agent). It failed miserably

Crown 👑@ciruai·2d

I'm seeing the light. The Gemma models are actually extremely good. The 12b might be even better at Hermes than qwen3.6 35b. My AMD Strix Halo gets 115 TPS+ with 26b QAT MTP. New quality tests run: llm.ciru.ai/#hermes-agent @usr_bin_roygbiv

Crown 👑@ciruai·12h

@alexellisuk @arunninghacker I'm not sure, but the Spark is a tiny silver box. It can easily run an uncensored 35b.

Alex Ellis@alexellisuk·13h

@ciruai @arunninghacker Are you sure that's what he means? Something he said doesn't add up: "with an Nvidia card in it" GB10 is a SoC with unified memory..

Crown 👑@ciruai·12h

@itsjustmarky @sudoingX You can reserve as little as 0.5 GB for VRAM in the BIOS and leave the rest for unified.

sudo rm -rf@itsjustmarky·13h

@ciruai @sudoingX I am able to use over 110g with mine just a skill issue.

Crown 👑@ciruai·14h

@SMNYC1 @arunninghacker It can be really good, what do you use AI for? llm.ciru.ai/crown-citadel-…

bird@SMNYC1·1d

@arunninghacker Mind sharing specs? Curious how good local can be

Crown 👑@ciruai·14h

@alexellisuk @arunninghacker GB10, it's a Spark, iGPU

Alex Ellis@alexellisuk·1d

@arunninghacker I also run Qwen 3.6. Curious which box and which Nvidia GPU? Sounded like a Mac Mini, but you didn't say eGPU.
