Alex

865 posts

Alex

@wadeAlexC

Oolong maximalist. Bureaucracy minimalist. | Prev: breaking things @eigenlayer, @ConsenSysAudits, @EtherealVC

[REDACTED] 가입일 Temmuz 2018

311 팔로잉1.9K 팔로워

Alex@wadeAlexC·21 Mar

@alpeh_v

QME

Alex@wadeAlexC·21 Mar

@alpeh_v "made it not bad for you" no, i'll find a way. trust.

English

ʕ •ᴥ•ʔ@alpeh_v·20 Mar

theres something so hopeful about the slot machine rougelike and the poker rougelike, like yes we took this super popular gambling game and made it not bad for you or extractive while also being more fun

English

1.1K

Alex@wadeAlexC·21 Mar

ChatGPT's interface is hot garbage. There's very little contrast between different UI elements - it's black, on dark grey, on charcoal. It's peak modern design - everything's round and bubbly in a way that gives icons and elements a samey vibe. I really like Claude's interface. It's warmer overall, but they darken the users' message background so it pops out. They use a serif'ed font for Claude that makes it really easy to read (to me). They also use more straight lines in their icons/elements that create clearer borders to divide up the screen.

English

104

StErMi@StErMi·20 Mar

I've reflected about it and the way ChatGTP answer is really overwhelming. There have been times that I closed the chat because I was lost in the answer. I wonder if I made some customizations way back in time and that's the result of it.

English

111

StErMi@StErMi·20 Mar

I have uploaded a blood test result to both @ChatGPTapp and @claudeai. ChatGPT really needs to improve drastically the way it "render" the output. Claude's UI/UX is orders of magnitude better than ChatGPT. The way ChatGPT writes back to you is always so disorganized and visually confusing, it's like it's puking a wall of text back to me with random bullet points and emojis.

English

694

Alex@wadeAlexC·21 Mar

@0xClandestine that's wild. i wonder if it'd be faster if it also used RAM

English

clandestine.eth 🦇🔊@0xClandestine·21 Mar

@wadeAlexC Funny enough it doesn’t even use much of my 64gb ram, few gbs. Flash-moe aggressively streams the entire model from ssd.

English

clandestine.eth 🦇🔊@0xClandestine·21 Mar

Running Qwen3.5 397B on my 64GB M4 Max at 5 tok/s (4-bit) using flash-moe. Local frontier intelligence is almost here.

Dan Woods@danveloper

x.com/i/article/2034…

English

302

Alex@wadeAlexC·21 Mar

@Duffaluffaguss Not sure what you're asking. I haven't personally gotten into Openclaw, but you could for sure hook up a local model to it 😊

English

duffaluffaguss.eth@Duffaluffaguss·21 Mar

@wadeAlexC Can't interact using Openclaw?

English

Alex@wadeAlexC·20 Mar

I ditched cloud LLMs for self-hosted last year. 24/7 availability. My phone. My laptop. My GPU. It's faster than ChatGPT. Almost as smart. It's not being used by the DoD to conduct warrantless surveillance. No data leaves my server. I own everything. When things break, I can fix them myself. I can ask anything I want and get an answer back 10-20x faster than I can read. It costs pennies to run. Why isn't everyone doing this?

English

379

Alex@wadeAlexC·21 Mar

A good way to test would be to try out some of the target models you'll want to run via OpenRouter. With 256-512 GB, you'll probably want to look at these: - Qwen3.5-397B-A17B - Minimax M2.5 - GLM 5 (you'll need quantized versions of these, and im not personally familiar with their game; I stick to models < 100B params)

English

Alex@wadeAlexC·21 Mar

@Duffaluffaguss @LLMJunky I'm not 100% sure what your usecase is. You said "self-maintaining business with ongoing research" -- sounds like you don't care if your llm is slow, it just needs to be smart? Mac Studios are, what, 256-512 GB unified memory? Yeah, that'll let you run some really great models.

English

Alex@wadeAlexC·21 Mar

@Duffaluffaguss Local models are capable of that, but you need an interface that feels intuitive to use and hides a lot of that complexity from the user. Ex: Qwen3.5 can't output images. If I wanted to add that feature, I would need my interface to do some fancy routing to image-gen models.

English

Alex@wadeAlexC·21 Mar

@Duffaluffaguss For generic chat, 95% of the time you won't notice a difference between local and frontier. Frontier is only decent because their interface has a lot of features. For example, you go to chatgpt, and it can ingest and output images, do web search, write/execute code.

English

Alex@wadeAlexC·20 Mar

@TheCesarCross Qwen3.5-35B-A3B is my go to. Absolute beast. I posted more details about my setup in the other replies :)

English

cCross@TheCesarCross·20 Mar

@wadeAlexC Which models are you running on your setup?

English

Alex@wadeAlexC·20 Mar

Models: Qwen3.5-35B-A3B and Qwen3.5-27B Unsloth Q4 quants for both. Both easily fit <24 GB VRAM at 100k context. Device: I have an NVIDIA RTX 5090 and keep everything on-GPU. However, with these models/quants, you can accomplish the same even with a 4090. I haven't tried any Mac products. I hear good things, but my understanding is you're getting slower inference (but maybe you can run larger models). Whether you want an NVIDIA setup or a Mac setup probably comes down to your usecase. - If you mainly want to support a "chat app" or "personal assistant", you want the nicest GPU you can get, because you're gonna want the speed. - If you want like... home assistant/automation/coding agents, maybe opt for something with better capacity for larger models? Mac Studio, DGX Spark, things like that. I can't advise as much here because I went the GPU route.

English

duffaluffaguss.eth@Duffaluffaguss·20 Mar

@wadeAlexC What models and on what device? I'm trying to figure out what I need and how well it will perform and I feel like everything I read suggests something different. Mac studio with 256 ram, custom Nvidia setup. Which models. Ty ser

English