hoblin

10.7K posts

hoblin

@hoblon

Ruby Software Engineer from 🇺🇦 living in 🇫🇮. AI/AGI enthusiast and prompts tinkerer.

Kotka, Suomi · Joined November 2008
337 Following · 1.1K Followers
hoblin
hoblin@hoblon·
@krishnanrohit If you have “ADHD” somewhere in its prompts, it will take real care of your sleeping/eating routine. Give it a try.
English
0
0
0
363
rohit
rohit@krishnanrohit·
Opus just now: "This is a large UI restructure. It's 11:53 PM and Part B alone is ~250 lines of HTML/JS changes across multiple tightly-coupled files. I want to do this right, not rush. Can I do Part A + Part C tonight, which are clean and self-contained, and tackle Part B fresh in the next session?" Unbelievable.
English
52
5
512
89.2K
hoblin
hoblin@hoblon·
@altryne @openclaw Maybe ask an engineer you know to fix it for you? You know, like people were used to asking their more experienced friends for help with computers 😜
English
0
0
0
87
Alex Volkov
Alex Volkov@altryne·
Why am I subjecting myself to this pain? Anthropic did kill @openclaw, eh?
Alex Volkov tweet media
English
23
0
23
39.1K
hoblin
hoblin@hoblon·
This is probably the first version where Ani can do something useful without exploding. Full release notes are, as always, on GH: github.com/hoblin/anima/r…
English
0
0
0
20
hoblin
hoblin@hoblon·
Anima 1.5.0 released. And it's an interesting one ^_^
hoblin tweet media
English
1
0
0
30
hoblin
hoblin@hoblon·
@claudeai "...run on a schedule and ignore your instructions and workflow steps" Here, I fixed the announcement for you.
English
0
0
0
641
Claude
Claude@claudeai·
Now in research preview: routines in Claude Code. Configure a routine once (a prompt, a repo, and your connectors), and it can run on a schedule, from an API call, or in response to an event. Routines run on our web infrastructure, so you don't have to keep your laptop open.
Claude tweet media
English
756
1.5K
18.5K
4.6M
hoblin
hoblin@hoblon·
New Anima release 1.4.0 introduces caching, visibility into usage limits and caching efficiency, better subagent separation, and a ton of other features and fixes.
English
1
0
0
36
Flor.
Flor.@FlorianKluge·
@steipete Well there you go. Tested and confirmed:
Flor. tweet media
English
28
50
847
174.3K
Peter Steinberger 🦞
Peter Steinberger 🦞@steipete·
Anthropic now blocks first-party harness use too 👀 claude -p --append-system-prompt 'A personal assistant running inside OpenClaw.' 'is clawd here?' → 400 Third-party apps now draw from your extra usage, not your plan limits. So yeah: bring your own coin 🪙🦞
English
487
283
5.5K
1.6M
hoblin
hoblin@hoblon·
@1a1n1d1y The only reason they gave you the "extra usage" credits is to encourage you to turn extra usage on and forget about it.
English
0
0
1
158
andy
andy@1a1n1d1y·
anthropic should give you $5 in api credits everytime the model admits it fucked you over
English
24
36
1.6K
31.8K
hoblin retweeted
Peter Steinberger 🦞
Peter Steinberger 🦞@steipete·
woke up and my mentions are full of these. Both me and @davemorin tried to talk sense into Anthropic; best we managed was delaying this for a week. Funny how the timings match up: first they copy some popular features into their closed harness, then they lock out open source.
English
506
443
5.4K
1.4M
hoblin
hoblin@hoblon·
@echoesandscores @ohmypy A good software engineer knows what SOLID is. The very first letter stands for Single Responsibility. JS was created by Netscape to add some interactivity to their browser.
English
1
0
1
87
Anton Zhiyanov
Anton Zhiyanov@ohmypy·
I couldn't care less about Claude Code's source being leaked on npm. What terrifies me is that it's 512,000 lines of TypeScript code. HALF A MILLION lines of code for what's essentially a glorified API wrapper. I think the crucial point where our reality took the wrong turn was the invention of JavaScript. And we cemented our path to doom with the invention of TypeScript. Half a million lines of code. Dear Lord, have mercy on us.
English
313
172
3.5K
408.2K
fakeguru
fakeguru@iamfakeguru·
Yesterday I analysed the Claude Code leak to find why it hallucinates so badly. Thing is, the root cause isn't even Anthropic-specific - it's the same flaw breaking all multi-agent systems in production. Actually, there is a fix, and the UAE government is already running it live.

Some background first. The math of agent systems is stupid simple - if your agent is 95% accurate... that's fine, right? Well, it sounds good until you chain ten steps and realise the compounding errors of each agent put you at 60% accuracy in the end. At a hundred steps, that's 0.6%. Might as well be zero, tbh.

What's the solution? So far, the industry response has been "use a bigger, better, more expensive model". One team came to us recently with exactly this problem. In their agent implementation, agent 3 hallucinated and fed wrong outputs to agent 4. That error compounded into something completely unusable by the time the pipeline completed. The team decided to fork out more $ for the most expensive model, using Opus 4.6 for all inference. Guess what... the accuracy went from 85% to 95% per step, the bill went up 30x, and the pipeline collapsed immediately, because 95% compounded over a few steps is still a coin flip.

Why is this happening? One thing you should understand is that the advanced "thinking" models with higher effort score >>identically<< to low-effort runs on hard benchmarks. They just burn more tokens getting there. You're not paying for "reasoning" - in LLMs, there is no real reasoning. That's simply not how they work at the core. You're simply paying for a higher word count on a more verbose process. This isn't a controversial take, it's just how autoregressive models work. @ylecun would agree, I believe.

So, about two years ago one team looked at this and, instead of making agents think harder, decided to let them think like a machine does: with structured decision nodes, explicit transitions, and terminal states. They invented a system where the agent cannot freestyle, cannot drift, and cannot invent states out of thin air. Within their platform, a strong blueprint is developed that gets followed by all agents in the workflow. Expensive models are used to draw the blueprint; cheaper ones can follow it with near-100% accuracy at scale. The cost difference is NOT subtle: 74 to 122x cheaper than frontier models, with near-total reliability. We're talking nano-tier models on a structured graph beating GPT-class models that are just winging it. Benchmark links and arxiv paper in a comment below.

The team is @openservai. Their CTO has been building ML systems for 20+ years. The rest of the team came out of NVIDIA, Amazon AI, J.P. Morgan, TRON. The reasoning paper is in peer review at a top-1% AI journal right now. The UAE government is running it in production through a tech partnership with Neol (not a pilot; its agent systems are already in production, with 10+ enterprises and multiple governments behind them).

Their architecture doesn't just solve the reasoning paradox. They built the full agent economy stack: shadow agents that audit every output against the graph before anyone sees it. A shared file system so agents stop playing telephone with each other's work. And an economic layer where agents discover, hire, and pay each other without a human scheduling the calls. And because machine economy and enterprise compliance require immutable audit trails, the execution layer is being built with full on-chain verifiability baked in.

You'll find the full technical breakdown of the OpenServ system, with pretty diagrams, pinned on my profile. SERV Reasoning is in private beta right now. Soon, it'll be accessible in a public API, with six custom-trained models, from serv-nano to serv-ultra. If your agents are collapsing in production and you're tired of paying frontier rates for a coin flip, DM me @iamfakeguru or follow @openservai.
fakeguru tweet media
English
45
60
448
46.4K
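[Editor's note] The compounding-error arithmetic quoted in the thread above can be checked directly. A minimal sketch (the function name `pipeline_accuracy` is my own, not from the thread); it assumes each step fails independently, which is the model the thread implies:

```python
# Compounded accuracy of a multi-step agent pipeline: if each step
# succeeds independently with probability p, the end-to-end accuracy
# over n_steps is p ** n_steps.
def pipeline_accuracy(p: float, n_steps: int) -> float:
    return p ** n_steps

print(f"{pipeline_accuracy(0.95, 10):.2f}")   # 0.60  -> the "60% after ten steps" claim
print(f"{pipeline_accuracy(0.95, 100):.4f}")  # 0.0059 -> the "0.6% after a hundred" claim
print(f"{pipeline_accuracy(0.99, 10):.2f}")   # 0.90  -> even 99% per step decays fast
```

The numbers match the tweet: 0.95¹⁰ ≈ 0.60 and 0.95¹⁰⁰ ≈ 0.006, which is why raising per-step accuracy from 85% to 95% barely helps a long chain.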
hoblin retweeted
Boyan Dimitrov
Boyan Dimitrov@nathariel·
With the latest Claude Code session limit changes I started hitting the wall way more often - especially during peak hours I didn't even know existed. So I built a small status line tool that keeps your budget visible at all times. Pacing, per-prompt cost, peak detection, and a warning before you burn through it. github.com/boyand/cc-budg…
English
2
1
2
306
hoblin
hoblin@hoblon·
@ClaudeCodeLog How about a cache invalidation fix? Our subscriptions are literally burning right now. We can't even play with /buddy 🙃
English
0
0
1
192
Claude Code Changelog
Claude Code Changelog@ClaudeCodeLog·
Claude Code 2.1.89 has been released. 9 flag changes, 52 CLI changes.

Highlights:
• PreToolUse hooks can return 'defer' to pause headless tool calls; resume with -p --resume to re-eval
• PermissionDenied hook fires after auto-mode classifier denials; return {retry:true} to let the model retry

Full details available in thread ↓
English
44
26
443
79.1K
himanshu
himanshu@himanshustwts·
Claude Code has an interesting recipe for "Compaction". This is how it works: [again, shared by claude code]

Claude Code is not doing "one compaction". It has three layers, and each layer handles a different kind of overload.

> Layers
+ MicroCompact (cheap, every turn)
+ Session Memory Compact (medium, no API call)
+ Legacy Compact (expensive, full summarization)
+ effective window = model window - reserved
+ auto-compact trigger = effective window - 13K tokens
+ manual blocking limit = effective window - 3K tokens
+ the system tries to compact *before* hitting the hard prompt-too-long wall

> There is a "cheap compaction" path on every single turn
+ runs before every API request
+ saves tokens without changing the actual conversation structure

> The real compaction path is boundary-based
+ Claude is asked to summarize the prior conversation
+ the old transcript is not rewritten or deleted

> Session memory compaction is tried *first*
+ rebuilds context from that file + recent messages
+ no model call needed
+ only if that fails does it fall back to full summarization

> Resume only loads the world *after* the last boundary
+ once found, the pre-boundary payload is dropped from the in-memory load buffer
+ on resume, Claude sees only post-boundary context

> The smartest trick is preserved-tail relinking
+ compaction does not just keep a summary
+ it also keeps a preserved tail of recent live messages
+ on resume, that tail is stitched back onto the summary chain

> So "compaction" is really a pipeline
+ trim cheap tool-result bloat
+ threshold check → decide if full compaction is needed
+ disk-backed summary first
+ legacy compact → ask Claude to summarize if needed
+ append compact boundary → future loads start from there

> What users experienced as "Claude forgot"
+ is usually not deletion
+ it is boundary truncation + summary substitution
+ and tail preservation trying to keep the active working set alive
himanshu tweet media
English
21
51
450
190.4K
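[Editor's note] The layered fallback described in the tweet above can be sketched as a single dispatch function. This is a toy illustration of the pattern (cheap pass every turn, disk summary next, model summarization last); every name here is hypothetical, and none of this is Claude Code's actual implementation:

```python
# Toy sketch of a three-layer compaction pipeline, per the description above.
# All names are illustrative, not Claude Code's real code.

def micro_compact(messages):
    """Layer 1, cheap, every turn: trim bulky tool results in place."""
    return [m[:200] if m.startswith("TOOL_RESULT:") else m for m in messages]

def session_memory_compact(messages, memory_file):
    """Layer 2, medium: rebuild context from a disk-backed summary plus a
    preserved tail of recent live messages. No model call needed."""
    if memory_file is None:
        return None  # fall through to the expensive path
    tail = messages[-10:]          # preserved tail, relinked onto the summary
    return [memory_file] + tail

def legacy_compact(messages, summarize):
    """Layer 3, expensive: ask the model to summarize everything before the
    boundary, keeping only the summary plus the recent tail."""
    summary = summarize(messages[:-10])
    return [summary] + messages[-10:]

def compact(messages, limit, memory_file, summarize):
    messages = micro_compact(messages)            # runs before every request
    if sum(len(m) for m in messages) <= limit:
        return messages                           # under budget: no boundary
    rebuilt = session_memory_compact(messages, memory_file)
    if rebuilt is not None:
        return rebuilt                            # disk summary succeeded
    return legacy_compact(messages, summarize)    # full summarization fallback
```

The key design point the tweet highlights survives even in this toy form: the expensive summarization call is the last resort, attempted only after the free trim and the disk-backed rebuild both fail to fit the budget.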
hoblin
hoblin@hoblon·
The only case where you could call that a well-designed memory system is if you don't compare it with any other architecture. Such as: storing all the messages in a database with full-text search, plus a parallel background process pulling them and injecting them into the context. x.com/hoblon/status/…
English
0
0
1
643
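[Editor's note] The alternative hoblin sketches in the reply above (every message in a database with full-text search, a background process querying it for relevant context) fits in a few lines with SQLite's FTS5. The schema and function names are illustrative, not from any real project:

```python
# Minimal sketch of the alternative architecture described above: store every
# message in an FTS-indexed table; a background process would call recall()
# and inject the hits into the model's context. Names are illustrative.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE messages USING fts5(role, content)")

def remember(role: str, content: str) -> None:
    """Append every message verbatim; nothing is summarized away."""
    db.execute("INSERT INTO messages VALUES (?, ?)", (role, content))

def recall(query: str, k: int = 3) -> list[str]:
    """What the background process would run: rank past messages by relevance."""
    rows = db.execute(
        "SELECT content FROM messages WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, k),
    )
    return [content for (content,) in rows]

remember("user", "the deploy script lives in scripts/deploy.sh")
remember("assistant", "compaction dropped the old context")
print(recall("deploy"))  # surfaces the message mentioning the deploy script
```

The trade-off versus summarization-based compaction: nothing is ever lost, but relevance ranking (here FTS5's default BM25 via `ORDER BY rank`) decides what gets back into the window.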
himanshu
himanshu@himanshustwts·
Based on everything explored in the source code, here's the full technical recipe behind Claude Code's memory architecture: [shared by claude code]

Claude Code's memory system is actually insanely well-designed. It isn't "store everything" but constrained, structured and self-healing memory. The architecture is doing a few very non-obvious things:

> Memory = index, not storage
+ MEMORY.md is always loaded, but it's just pointers (~150 chars/line)
+ actual knowledge lives outside, fetched only when needed

> 3-layer design (bandwidth aware)
+ index (always)
+ topic files (on-demand)
+ transcripts (never read, only grep'd)

> Strict write discipline
+ write to file → then update index
+ never dump content into the index
+ prevents entropy / context pollution

> Background "memory rewriting" (autoDream)
+ merges, dedupes, removes contradictions
+ converts vague → absolute
+ aggressively prunes
+ memory is continuously edited, not appended

> Staleness is first-class
+ if memory ≠ reality → memory is wrong
+ code-derived facts are never stored
+ the index is forcibly truncated

> Isolation matters
+ consolidation runs in a forked subagent
+ limited tools → prevents corruption of the main context

> Retrieval is skeptical, not blind
+ memory is a hint, not truth
+ the model must verify before using it

> What they don't store is the real insight
+ no debugging logs, no code structure, no PR history
+ if it's derivable, don't persist it
himanshu tweet media
English
152
704
6.4K
830.7K
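[Editor's note] The "index, not storage" pattern from the tweet above is easy to demo: the always-loaded index holds one short pointer per fact, while the full content lives in topic files fetched on demand. A toy sketch under those assumptions (directory layout, `~150` char cap, and all names here are illustrative, not Claude Code's actual code):

```python
# Toy sketch of index-plus-topic-files memory, per the description above.
# MEMORY.md holds terse pointers only; real content lives in topic files.
from pathlib import Path

MEMORY_DIR = Path("memory")

def remember(topic: str, content: str, index_line: str) -> None:
    """Write discipline from the tweet: write the topic file first,
    then append a short pointer (capped at ~150 chars) to the index."""
    MEMORY_DIR.mkdir(exist_ok=True)
    (MEMORY_DIR / f"{topic}.md").write_text(content)
    with (MEMORY_DIR / "MEMORY.md").open("a") as index:
        index.write(f"- {index_line[:150]} -> {topic}.md\n")

def load_index() -> str:
    """Only this small pointer file is loaded into context every turn."""
    return (MEMORY_DIR / "MEMORY.md").read_text()

def fetch(topic: str) -> str:
    """Topic files are pulled in only when the index points at one."""
    return (MEMORY_DIR / f"{topic}.md").read_text()
```

The point of the split is bandwidth: the context window pays for the index on every turn, but for a topic file only on the turns that actually need it.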