
AgenticRebirth
33 posts

@AgenticRebirth
building with AI, sharing the journey in real time






I wrote up how I built the shitty robot so you can too. This was a fun project that will keep on giving. Thanks to all the open weights folks out there, without whom this would not have been possible. mariozechner.at/posts/2026-05-…



if you have a cursor sub or credits, you can test this now on my fork. github.com/sudoingX/herme…




One of my friends, @RowanNorth, recently described the AI workflow space as "a human centipede of eating each other’s shit". What do you think? 😂



I spent a weekend running numbers on LLM subscriptions versus running models locally. I wanted to find out how to maximise the quality of my LLM usage per dollar. What I found genuinely surprised me.

At 5 million tokens a day, which one solid agent workflow with tool calls burns through in a morning:
🔵 Claude Opus 4.8 costs about $1,500 a month.
🔵 GPT-5.5 runs closer to $1,700.
That is $18,000 to $20,000 a year on token bills.

A local machine with an RTX 5090 costs about $4,000 to $5,000 and a used RTX 3090 runs $800 to $1,000. Electricity adds maybe $40 a month. The break even point is three to six months depending on which API you choose to use. After year one you are $10,000 to $16,000 ahead. Those are not small numbers. In reality, the gap is even larger, since I use billions of tokens a month.

Then there are the intangible benefits like no rate limits, no vendor lock-in, no sending your information to third parties. The "buy a GPU" crowd (h/t @TheAhmadOsman) actually had it right.

What actually makes sense is running both, and routing based on the task:
🔵 Local models for bulk work: evals, experimentation, batch processing, anything where zero marginal cost changes how freely you iterate.
🔵 API models when quality is the actual constraint: customer facing output, complex reasoning, the decisions that cost more to get wrong than to pay for.

For maximum efficiency per dollar, you could use DeepSeek V4 Flash for things that do not need a frontier model, and use Claude Opus or GPT-5.5 for the 30 percent that genuinely do.

There are only two questions that actually matter: what is your real daily token volume, and what is your quality sensitivity for each category of work you do. Your best setup follows from those.

Looks like I'm buying some GPUs.
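The break-even arithmetic and the local-versus-API routing rule in this post can be sketched in a few lines of Python. This is a rough illustration, not the author's actual spreadsheet or router: the function names, the task labels, and the choice to send unknown task kinds to the API are all assumptions made for the example. It also assumes the local machine replaces the whole API bill and ignores every hardware cost except the purchase price and electricity, so it will tend to give a shorter break-even than a more careful estimate.

```python
# Hypothetical helpers; names and task labels are illustrative assumptions.

def break_even_months(hardware_cost, api_monthly, electricity_monthly=40.0):
    """Months until the hardware purchase is paid back by avoided API spend."""
    monthly_saving = api_monthly - electricity_monthly
    if monthly_saving <= 0:
        raise ValueError("local running costs meet or exceed the API bill")
    return hardware_cost / monthly_saving


def first_year_savings(hardware_cost, api_monthly, electricity_monthly=40.0):
    """Net savings after 12 months, assuming all API usage moves local."""
    return 12 * (api_monthly - electricity_monthly) - hardware_cost


# Task categories taken from the post; the string labels are made up here.
LOCAL_TASKS = {"evals", "experimentation", "batch_processing"}
API_TASKS = {"customer_facing", "complex_reasoning"}


def route(task_kind):
    """Return 'local' for bulk work and 'api' where quality is the constraint.

    Assumption: unknown task kinds fall back to 'api', on the theory that
    paying for quality is the safer default when the category is unclear.
    """
    if task_kind in LOCAL_TASKS:
        return "local"
    return "api"


if __name__ == "__main__":
    # Figures quoted in the post: $4,000-$5,000 machine, $1,500-$1,700 API bill.
    for hardware in (4000, 5000):
        for api in (1500, 1700):
            months = break_even_months(hardware, api)
            saved = first_year_savings(hardware, api)
            print(f"hardware=${hardware} api=${api}/mo: "
                  f"break-even {months:.1f} months, year-one net ${saved:,.0f}")
    print(route("evals"), route("customer_facing"))
```

Plugging in your own daily token volume and the share of work that truly needs a frontier model (the two questions the post closes on) changes `api_monthly` and therefore the whole result.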


Just want to make this clear: we didn't make Hermes Agent to be a "starts with nothing, you work it all out" agent. This is not the minimalist, start-from-nothing agent.

We want Hermes to work out of the box for most people, so you aren't spending weeks just getting the agent to work or to have the capabilities you need. This means that yes, there are more built-in things than in something like nanoclaw or pi, which start with nothing and leave you to figure it out. That is an intentional design decision.

You start from a modest baseline whose capabilities are likely broader than you need, but not egregious, and you can take it from there if you want to tinker. Run `hermes skills config` or `hermes tools` to disable whatever you want. We even have a way to upload your whole "Agent" as a GitHub repo, so you can install Hermes fresh with your exact setup again later, or share it. We have a massive interface for extensions, so you can tinker with it to infinity.

But if you don't want to become an agent engineer - with Hermes, you don't have to.





