Max Milne
@maxmilneaus
65 posts

All my friends are autistic, but i'm not. https://t.co/05DyKKW56F

Australia · Joined February 2026
27 Following · 1 Follower
Max Milne @maxmilneaus
@jameygannon How have you found context switching between new domains of pipeline and architecture knowledge? For me it is akin to moving house every 3 months: you get to know the neighbourhood but never build the muscle-memory flow states. Looking forward to the industry settling.
Replies 0 · Reposts 0 · Likes 0 · Views 10
Max Milne @maxmilneaus
@jameygannon I find myself nodding my head “yes” as I read along. As an advertising photographer pre-AI, I’ve tried the majors: the claws, the frontier labs, the boutiques, and they all want to say “it just automates it all away”. I wish. Though I wonder what would be lost if it did?
Replies 1 · Reposts 0 · Likes 2 · Views 70
techbimbo @jameygannon
the insane claims AI companies need to make in order to fundraise and maintain their valuations have really bitten them in the ass when it comes to consumer sentiment. as a creative, i see this manifesting in two ways:
1. they try and do hype marketing that tells people they don’t matter, and that they’re going to be replaced
2. they market their tool as SO EASY that in just a few clicks you’ll have a perfect output… this is almost never the case, and leads to huge frustration and distrust, even from technically literate demographics
honesty and education are going to be extremely important for AI companies if they want to get adoption past the nerds in the X bubble
GREG ISENBERG @gregisenberg

AI has a serious branding problem. Probably worse than web3/crypto/NFTs. If you ask the average person in the street, they probably fear and hate AI.

Replies 5 · Reposts 1 · Likes 57 · Views 13.1K
Max Milne @maxmilneaus
Sky AI - the secret sauce in Codex's computer use? sky.app
Replies 0 · Reposts 0 · Likes 0 · Views 11
Sudo su @sudoingX
mac users: qwen 3.6-35B-A3B is hitting 91 tok/s on an M4 Max 128GB via MLX in LM Studio. those are solid first numbers from @dreamworks2050. i don't have apple hardware to benchmark myself, so i'm counting on the community for mac data. if you're running qwen 3.6 on any M4, M3, or M2 chip, drop your tok/s, quant, and app below. especially interested in how MLX compares to llama.cpp on the same chip, same model. the more configs we collect, the faster everyone finds their optimal setup
M4rc0z @dreamworks2050

@Alibaba_Qwen QWEN 3.6-35B-A3B MLX FIRST LOOK 👀 91tps @ M4 MAX 128

Replies 46 · Reposts 12 · Likes 273 · Views 45.8K
Moshe @Mosescreates
I'm going all in on Hermes (@NousResearch, @Teknium1) as my entire agent and coding stack. Six profiles. One shared self-hosted memory store. Zero hosted-coder dependencies.

The fleet:
- pmax-mousa — my own WhatsApp + Email + Google Workspace agent
- pmax-tarek — my co-founder's Telegram + Email agent
- pmax-dareen — our content creator's WhatsApp assistant (LIVE on real client chats)
- pmax-content — background content ops
- pmax-ops-observer — daily health reports
- pmax-coder — my primary coding CLI, no hosted coder, no gateway

The model dial — this is the part I'm most excited about: pmax-coder runs on GLM-5.1 native via the Z.AI Coding Plan (@Zai_org, quarterly $45). Direct to api.z.ai, no middleman, no OpenRouter tax. GLM-5.1 published the exact thing I needed — a frontier coder at a flat price I can plan around.

I've spent the last three days heads-down just getting the system running. Not tweaking it. Not optimizing it. Getting it to stand up end-to-end without a single load-bearing piece silently falling over. Six profiles, one memory store, two hosts, a dozen services, launchd, Tailscale, native provider pinning, patch re-application, ghost-process recovery, bridge port collisions, FTPS quirks, CI cycles, Qdrant lock contention, Happy Eyeballs hangs — every one of them a real bug I hit and fixed before I could move on. The three days are the story.

The five gateway profiles (pmax-mousa, pmax-tarek, pmax-dareen, pmax-content, pmax-ops-observer) all run on qwen/qwen3.6-plus via OpenRouter native Alibaba routing (@Alibaba_Qwen, @OpenRouterAI). I pinned native-only with a strict provider.only patch so nothing silently falls through to a more expensive lane. Offline fallback everywhere is gemma-4-31b-it-4bit served by oMLX on the Mac Studio. If OpenRouter or Z.AI goes sideways mid-conversation, every profile transparently fails over to local MLX inference and the user never notices. Swapping models is one YAML line.

The real unlock: unified self-hosted memory. Every Hermes profile reads and writes one mem0 store on my MacBook (Qdrant + Ollama nomic-embed-text embeddings, zero cloud). Claude Code (@claudeai, @AnthropicAI) is wired to auto-broadcast every session turn into the same store via a Stop hook. The direction of flow is Claude writes, Hermes listens. Anything I decide in a Claude Code session is visible to the WhatsApp agents on my very next message. Nothing gets re-explained. Ever.

Two-host architecture over Tailscale:
- MacBook (100.x.x.x) is the service layer. It runs mem0-server on 7437, task_server v1.1.3 on 7439, the guru-code router cache on 7450, the content-review webhook on 7438, all the Claude Code hooks, daily backup cron, and the mem CLI.
- Mac Studio M4 Max (100.x.x.x) is the agent layer. It runs Hermes v2026.4.13-118, all six profile gateways under launchd, the Hermes dashboard on 9119, the WhatsApp and Telegram bridges, and the Google Workspace OAuth session.
Both hosts are pinned to IPv4 over the tailnet because macOS Happy Eyeballs was randomly hanging on IPv6 tailnet paths — one flag on every curl and ssh killed a whole class of flakiness.

Huge credit to @brian_cheong — his push on idempotency-on-retries directly shaped task_server v1.1.3 (Idempotency-Key header on every write path), pipeline.py deterministic run_id dedup, and the guru-code router's response cache. Without that, retried agent actions would silently double-fire — a tool call would hit twice, a message would get sent twice, a file would get written twice. Whole classes of bugs I'll never write now. (btw I knew nothing abt idempotency bar — thanks dude)

What else ships with the stack:
- Daily Qdrant and task_server backups with a 14-day rotation, plus a weekly full Hermes zip.
- Ghost-process immunity on launchd restarts (a startup_guard script kills any zombie api.py holding the Qdrant lock before mem0-server boots).
- A native-provider pinning patch that wires provider_routing.allow_fallbacks straight through to OpenRouter.
- A secret redactor that runs on every Claude Code turn end so OpenRouter keys, Anthropic keys, GitHub PATs, and Bearer tokens can never leak into transcripts.
- A mem audit command that scans the memory store itself for leaked patterns.
- A `fleet` one-shot status command I can run from any terminal to get a color-coded snapshot of every service on both hosts plus GitHub Actions status plus the Hermes patch inventory.

Over these three days I also pulled 98 commits of upstream Hermes in two passes (70 + 28) without losing a single custom patch. An update-check cron inventories every local patch weekly so nothing regresses silently. Upgrades are safe. That's the invariant I wanted and I finally have it.

None of this is a custom AI platform. It's Hermes doing what Hermes does, plus a few surgical patches I kept small enough to re-apply on every upstream pull. The whole philosophy is minimal lock-in: use the upstream as much as possible, patch only the load-bearing seams, never fork.

The point isn't that Hermes beats every other coder tool today. The point is it's mine. I own the model dial, the memory store, the tools, the hooks, the backup policy, the security posture, the failover behavior. When something breaks I fix it. When I want to upgrade I upgrade. When I want to swap models I swap models. No middleman. No platform. No rug pull risk.

Reports from the field to follow.
Moshe tweet media
Replies 38 · Reposts 58 · Likes 911 · Views 78.8K
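The idempotency-on-retries pattern Moshe credits above can be sketched in a few lines (a minimal illustration only, not the actual task_server code; the cache, function, and key names here are hypothetical):

```python
# Minimal sketch of Idempotency-Key deduplication on a write path.
# A retried request carrying the same key replays the cached response
# instead of firing the side effect twice.

_results: dict[str, str] = {}  # idempotency key -> cached response
side_effects: list[str] = []   # stands in for "message actually sent"

def send_message(idempotency_key: str, body: str) -> str:
    if idempotency_key in _results:      # retry: replay, don't re-send
        return _results[idempotency_key]
    side_effects.append(body)            # first attempt: do the real work
    response = f"sent:{body}"
    _results[idempotency_key] = response
    return response

first = send_message("run-42", "hello")
retry = send_message("run-42", "hello")  # e.g. client retried on timeout
```

Both calls return the same response, and the side effect fires exactly once, which is why a retried agent action can no longer double-send a message or double-write a file.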
Max Milne @maxmilneaus
@Mosescreates @NousResearch Cool to hear about the shared memory component. I'm curious how to do the same and include OC and CC, since I'm thinking they all have their own default memory systems. Can you point me to how you landed where you did? Thanks for the detailed write-up too.
Replies 0 · Reposts 0 · Likes 0 · Views 11
Max Milne @maxmilneaus
@sudoingX Can you help me to understand - is this the launch of Hermes being hosted on MiniMax servers? Or the launch of a model adapted for Hermes?
Replies 0 · Reposts 0 · Likes 0 · Views 102
Sudo su @sudoingX
this is what happens when model teams and harness teams actually talk to each other. minimax co-evolving M2.7 with hermes agent's self-improving loop. watching the ecosystem grow from small beginnings to moments like these, the compounding is real. if you're still on a generic bloated harness you're missing the network effect that's building here
MiniMax (official) @MiniMax_AI

Capable agents are the result of co-evolution between models and harnesses. We've been working with @NousResearch to ensure that M2.7 x Hermes Agent provides a top-tier experience for users. Hermes’s self-improving loop brings out the best in M2.7 through real usage. We are also launching MaxHermes, a cloud-hosted and managed version of Hermes in @MiniMaxAgent (no terminal setup, no config). If you’re already running Hermes locally, you can now give your agent a partner in the cloud with MaxHermes. The path to AGI is shorter with good company. @NousResearch 🤝 @MiniMax_AI

Replies 6 · Reposts 10 · Likes 139 · Views 10.5K
Max Milne @maxmilneaus
@badlogicgames Computer use is legit. Yesterday I was trying to get Hermes/OpenClaw to convert resume.md into a slightly pleasant .pdf using Apple Pages. HTML > PDF? TextEdit doc > Pages? Neither was great. Codex computer use just did it.
Replies 0 · Reposts 0 · Likes 2 · Views 374
Max Milne @maxmilneaus
@garrytan I am wondering if it can be plugged into by an OpenClaw or Hermes agent, so they have a mutual 'memory'. Is that useful?
Replies 1 · Reposts 1 · Likes 2 · Views 591
Garry Tan @garrytan
GBrain v0.10.0 is a big one. My personal OpenClaw setup and brain can now be yours. I've perfected my RESOLVER.md, my SOUL.md, and ACLs for multi-user brain access. Now there are 24 distinct fat skills with fat code, fully tested with e2e tests, evals, and unit tests.
Garry Tan tweet media
Replies 108 · Reposts 146 · Likes 1.7K · Views 167.9K
Max Milne @maxmilneaus
@NeoAIForecast Mixture of Agents - have you enabled this in the Toolsets?
Replies 0 · Reposts 0 · Likes 0 · Views 69
Max Milne @maxmilneaus
@farzyness Yeah, I asked them both to do the same task... Hermes always for the win... and automatic Skills making is brilliant for my use case - Creative Output.
Replies 0 · Reposts 0 · Likes 0 · Views 220
Farzad 🇺🇸 🇮🇷
My super early impressions of OpenClaw vs Hermes Agent:

Hermes seems WAY more reliable at executing actual tasks — even on GPT 5.4. It also feels way more stable. And I absolutely love that it shows which tools it's calling as it's executing a task.

I also really like that the personality with GPT 5.4 is FAR better on Hermes as well — after a bit of tweaking. With OpenClaw, I was finding it impossible to get GPT 5.4 to stop talking like a sycophantic idiot. On Hermes, I can get it to be direct & push back with little effort.

I'm also finding that GPT 5.4 is FAR more reliable on Hermes vs OpenClaw (thanks @heyitsyashu for the tip). It really does feel like Opus 4.6 level performance on OpenClaw, but with even better execution on long-running tasks. Not sure what the Hermes team has done (I'm not technical at all), but it's obvious that the way they've constructed the back-end is far easier for LLMs to figure out what they should be doing.

Because OpenAI oauth is allowed, Hermes + GPT 5.4 is now EASILY the best intelligence-per-$ 'AI brain' for Agents.

I think the @openclaw team needs to deeply study @NousResearch and what they've done, because it could likely benefit MASSIVELY. Gut tells me OpenClaw has become FAR too bloated and FAR too 'jack of all trades, master of none'. When it comes to actual execution of tasks, Hermes feels WAY better equipped, and TBH I wouldn't be surprised AT ALL if it becomes adopted by actual businesses/operators at a far greater rate than OpenClaw.

I'm also starting to really worry that the days of OAUTH with 3rd party tools are coming to a screeching halt very soon. I don't think OpenAI is going to allow their oauth tokens to be used on an AI agent competitor when they've invested a lot of money in OpenClaw's creator.

What I think is gonna end up happening is OpenAI will stop allowing oauth use for 3rd party apps altogether (including OpenClaw), and they'll likely release their own OpenClaw v2 to try and compete against Hermes/Computer/CC. Hope I'm wrong, but these tools are far too powerful to be "allowed" to be open source + heavily subsidized tokens, especially as the public outcry re: AI's electricity cost continues intensifying.

But this will give way to ultra-capable, ultra-efficient open-source models that give 90%+ of Opus 4.6/GPT 5.4 performance for 1/10th of the cost, SPECIALIZED for agentic harnesses like OpenClaw/Hermes. It's starting to get REALLY interesting, folks.
Replies 68 · Reposts 25 · Likes 475 · Views 61.9K
Max Milne @maxmilneaus
@0xShayan @Teknium I'm driving Fire Pass - Kimi K2.5 Turbo (unlimited) + ChatGPT 5.4. It's a good mix of intelligence, I'm finding.
Replies 0 · Reposts 0 · Likes 1 · Views 37
Shayan @0xShayan
Is anyone using Hermes via the GLM 5.1 Pro/Max plan and if so, how is your experience? Tagging @Teknium for vis
Replies 77 · Reposts 1 · Likes 79 · Views 14.3K
Max Milne @maxmilneaus
@0xShayan @IEatCodeDaily @Teknium Hermes supports dedicated auxiliary models for eight task types:
1. vision
2. web_extract
3. compression
4. session_search
5. approval
6. skills_hub
7. mcp
8. flush_memories
You can nominate one for each. And doesn't the Plan come with MCP support for images?
Replies 2 · Reposts 0 · Likes 2 · Views 151
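Nominating one auxiliary model per task type, as described above, might look roughly like this (an illustrative sketch only; the model assignments and the `model_for` helper are hypothetical, not Hermes's actual config format):

```python
# Hypothetical per-task auxiliary model nominations: one model per task type.
# Model names are borrowed from elsewhere in the thread purely as examples.
AUX_MODELS = {
    "vision": "qwen/qwen3.6-plus",
    "web_extract": "glm-5.1",
    "compression": "gemma-4-31b-it-4bit",
    "session_search": "gemma-4-31b-it-4bit",
    "approval": "glm-5.1",
    "skills_hub": "qwen/qwen3.6-plus",
    "mcp": "glm-5.1",
    "flush_memories": "gemma-4-31b-it-4bit",
}

def model_for(task_type: str) -> str:
    """Return the nominated auxiliary model, or a default if none is set."""
    return AUX_MODELS.get(task_type, "default-model")
```

The appeal of the scheme is that cheap local models can handle background chores (compression, session search) while heavier tasks get a frontier model.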
Max Milne @maxmilneaus
@NousResearch I love this. Checking out the dashboard. As a noob user I wonder about adding a beginner's mode? That would look like a toggle with a little (i) next to the fields that explains what these things do in the bigger picture. Or even a pop-up of Hermes to help run you through it.
Replies 0 · Reposts 0 · Likes 1 · Views 1.6K
Nous Research @NousResearch
Hermes Agent v0.9.0 - “The Everywhere Release” Full changelog below ↓
Replies 216 · Reposts 248 · Likes 2.8K · Views 2.8M
Max Milne @maxmilneaus
@kieranklaassen @CoraComputer And I was thinking another reframe could be that English is the programming language. Two Agents speak with each other in a Telegram window. English. That is how they make requests and share data. Until they create their own language.
Replies 0 · Reposts 0 · Likes 0 · Views 483
Kieran Klaassen @kieranklaassen
Everyone's building mega-swarm systems. I just realized: a folder with a CLAUDE.md is already an agent. For @CoraComputer I have a source folder, a customer support folder, a bug investigation folder. Each is an agent. New discipline? New folder. No lock-in, no dependency. Orchestration is just one layer that spawns across folders. Build brick by brick first. Full article on Every →
Every 📧 @every

And get the full piece from Kieran on running 44 AI agents across multiple projects: every.to/source-code/th…

Replies 12 · Reposts 21 · Likes 243 · Views 80.7K
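Kieran's folder-as-agent convention can be sketched as a tiny discovery step (an illustration only, assuming a CLAUDE.md file is the marker that makes a folder an agent; `discover_agents` is a hypothetical helper, not part of any shipped tool):

```python
from pathlib import Path

def discover_agents(root: str) -> list[str]:
    """Every folder containing a CLAUDE.md counts as one agent.

    Returns the agent folders relative to `root`, sorted by name.
    """
    return sorted(
        str(p.parent.relative_to(root))
        for p in Path(root).rglob("CLAUDE.md")
    )
```

New discipline, new folder: dropping a CLAUDE.md into a `bug-investigation/` directory adds an agent with no registry, no lock-in, no extra wiring, and an orchestration layer only has to walk the tree.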
Max Milne @maxmilneaus
So for LLMs and Agents... English is the programming language. Two Agents speak with each other in a Telegram window. English. That is how they make requests and share data. English. Just learn how English works as a way to describe structures and patterns, rather than TS?
Replies 0 · Reposts 0 · Likes 0 · Views 28
Max Milne @maxmilneaus
@NeoAIForecast So Claude has Agent teams, where the orchestrator fires up a team lead and then other agents that all work and communicate together. Do you have a similar system in Hermes, or is it possible, or being considered at this time?
Replies 0 · Reposts 0 · Likes 0 · Views 187
Nico Bailon @nicopreme
pi-subagents updated so that if pi-intercom is also installed then subagents can ask the orchestrator for help and get replies back in real-time. Subagents can have a full two-way chat with the orchestrator. Credit to @DanielGri for the inspo. pi install npm:pi-subagents github.com/nicobailon/pi-… pi install npm:pi-intercom github.com/nicobailon/pi-…
Nico Bailon @nicopreme

New in pi-subagents: fully customizable best-of-N workflows. Run the same task across different models in separate worktrees in parallel, automatically pass results to parallel reviewers, then apply the winner to your branch. Create your own "best-of-N" prompt template (requires pi-prompt-template-model), or just ask Pi to run with parallel subagents in worktree mode. pi install npm:pi-subagents github.com/nicobailon/pi-… pi install npm:pi-prompt-template-model github.com/nicobailon/pi-…

Replies 12 · Reposts 20 · Likes 220 · Views 22.6K
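The best-of-N workflow quoted above can be sketched generically (a toy illustration of the pattern, not pi-subagents' actual API; `run_model` and `review` are hypothetical stand-ins for per-worktree subagent runs and parallel reviewers):

```python
from concurrent.futures import ThreadPoolExecutor

def run_model(model: str, task: str) -> str:
    """Stand-in for running the task on one model in its own worktree."""
    return f"{model} answer to {task!r}"

def review(candidate: str) -> int:
    """Stand-in reviewer: score a candidate (here, trivially by length)."""
    return len(candidate)

def best_of_n(models: list[str], task: str) -> str:
    # Run the same task across different models in parallel...
    with ThreadPoolExecutor() as pool:
        candidates = list(pool.map(lambda m: run_model(m, task), models))
    # ...then let the reviewer pick the winner to apply to your branch.
    return max(candidates, key=review)

winner = best_of_n(["model-a", "model-bb"], "fix the bug")
```

The real tool isolates each run in a git worktree and uses model-graded review; the skeleton above only shows the fan-out / review / select shape.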
Raufan 🍉 @muhraufan
A few years ago I did a solo photo exhibition & one thing that was really memorable to me was the guestbook. People leaving notes, greetings, cute messages. I wanted to bring that feeling to my personal website! So I made a guest card. If you're visiting, you can leave one ✨
Replies 7 · Reposts 8 · Likes 309 · Views 12.4K