RyanΞHawks

10.5K posts

@ryanthawks

Excited about agentic AI, agent economies, and Ethereum.

Joined December 2021

852 Following · 2.6K Followers

RyanΞHawks@ryanthawks·4 May

Is anyone having an issue with @claudeai forcing a new chat window - stopping compaction at about 10% and then messaging "This conversation is too long to continue. Start a new chat, or remove some tools to free up space." @ClaudeDevs my window is nowhere near as big as it has been in the past - when I get to a certain point, I have the chatbot create a .md to handoff to new chatbot. It will *not* let me do this, and this is *not* good - I have had ZERO warning. This is not acceptable. Luckily it can access Obsidian logs (onboarding, system state/truth, current state, backlog etc), and can access SQLite/Postgres and Filesystem... but the hand-off .md points the new chatbot in right direction and fills in any small gaps, it makes a big difference in doing a proper hand-off. But @claudeai is simply refusing to compact the window - and is forcing me to open a new chat window. I am on the $200 Max Plan, and this is pretty frustrating. Over and over it is getting more and more difficult to try and stay in the ecosystem. I am a bit tied in because I have a good workflow between Chat + Code and then Design + Cowork - you've really done a great job creating very useful harnesses for a variety of complex workflows that merge together into a final product; but recently there is so much inconsistency in the performance/execution. While it is not the end of the world as I do a lot to mitigate degradation on a compaction or new window - I get something close to 99.5% fidelity - without a proper .md hand-off, I lose a crucial few % relating to very recent work. So is this just a temporary bug @bcherny @trq212? Again, I do not have an enormous window - I know when I get close, this literally has come out of the blue, and I'd at least expect, being on the Max plan, at least a "heads up" before being forced into a new window.

179

RyanΞHawks@ryanthawks·2 May

Doing something similar, I have a claude chatbot MCP'ed into Obsidian/LLM wiki, SQLite/Postgres and filesystem. It has read/write to Obsidian, and it logs everything, systematically, obsessively, according to what we are doing - on context compaction, doesn't matter, picks right up where it left off. If I start a new chat window, I have a system truth and onboarding .md. Claude Code executes, the chat bot analyzes, designs, architects, fires the instructions/prompt, I copy/paste to Claude Code - executes. Cowork handles work I need done in the browser, or a bit more tooling, it has the same access to obsidian/filesystem/sqlite, so doesn't miss a beat. All works seamlessly. Doesn't matter if I hit a compaction, start a new chat window. Context degrades less than 1%.

123

Taylor Pearson@TaylorPearsonMe·1 May

Everyone treats Claude Code as a coding tool. After three months of running my whole life on it (house projects, consulting, writing, weekly planning), the architecture that emerged is three primitives, none of which are about code.
In As We May Work I argued that the breakthrough of Claude Code is two things (both of which grow out of giving it the ability to read and write files) which chat AI chatbots can't do:
1. It can take actions - it can write code, build project plans, and convert meeting notes into action items
2. It has long-term memory because you can have it write a history of its actions.
After about three months of actually building a system on top of those two facts, the architecture I keep coming back to is three primitives:
1. A Memory structure
2. A maintenance loop
3. Agentic primitives.
Memory
Files on your computer are the AI's memory. Chat AI lives in a context window that disappears at the end of each session; Claude Code reads and writes the filesystem, so the substrate persists. I run three layers.
- CLAUDE [dot] md is stable context, one per directory, auto-loaded by the harness. I have one at every level: global, vault, area, project.
- workbench [dot] md is a running session log per project. Dated entries newest at the top, decision reasoning attached.
- A daily memory log accumulates the essentials of every session across every project that day. Three months in, that daily log is the index of what I actually worked on, decided, and shipped.
Files are organized in a logical hierarchical order. A global CLAUDE [dot] md on my computer holds universal preferences. A vault CLAUDE [dot] md describes how my files are organized. Each area folder has its own CLAUDE [dot] md for that domain (the house, my business, my newsletter), and individual projects nest inside. When I open anything in a project folder, Claude Code walks up the tree and pulls every CLAUDE [dot] md it finds into context automatically.
In practice, that means when I open a session inside the project folder for the master closet renovation, Claude automatically loads four layers of context: my global identity, my vault-wide rules, my house design philosophy, and the closet's specific cabinet specs. I never request any of it. The harness walks up the directory tree and injects every CLAUDE [dot] md it finds.
Maintenance
The second primitive is a maintenance loop. Memory goes stale. The workbench is an append only markdown file that captures what actually happened yesterday, but CLAUDE [dot] md is still showing what I believed about the project last week. They drift apart without a reconciliation step. I have a /wrap skill that runs at the end of every session and reconciles them. It adds a new entry to the workbench (which keeps the full project history) and then rewrites the Current Status section of any CLAUDE [dot] md whose state changed during the session. The workbench gives the agent the full record of what's happened; CLAUDE [dot] md gives it the current snapshot. Skip a few wraps and the agent opens with stale context.
Agentic primitives
While the CLAUDE [dot] md file gives an overview of a specific domain, agentic primitives give it additional context or workflow rules. Skills (a SKILL [dot] md file plus supporting files), slash commands, and some other workflows (e.g. n8n) are examples. For example:
/weekly-review pulls calendar, time data, tasks, and inbox in parallel before I sit down so the gathering doesn't compete with the reflection.
/content-upload formats a finished article for Substack, Twitter, LinkedIn, and WordPress in about five minutes instead of an hour and a half.
/art produces editorial illustrations, technical diagrams, and social headers from a one-line description. It routes to the right model and prompt template for each format so I don't have to think about which tool fits the job.
I add a new one any time I catch myself doing the same thing twice.
I think of them functionally as SOPs for agents. Anything you would turn into a SOP should exist as a skill file. You could add custom code in here as well as part of some of these workflows - e.g. a skill that runs a python script.
These three primitives can compose into everything else I've built like my consulting workflow, writing pipeline, and weekly planning stack. The composition is where the interesting work happens.
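
The directory walk described in the post above (open a project folder, collect every CLAUDE [dot] md from there up to the root) can be approximated in a few lines of Python. This is only an illustrative sketch, not how the Claude Code harness is actually implemented: the helper names `collect_context` and `build_context` are hypothetical, and the outermost-first ordering is an assumption made so that broad context precedes narrow context.

```python
from pathlib import Path


def collect_context(start: Path, filename: str = "CLAUDE.md") -> list[Path]:
    """Return every `filename` found between `start` and the filesystem root.

    Hypothetical helper: results are ordered outermost first, so broader
    context (global, vault) comes before narrower context (area, project).
    """
    start = start.resolve()
    found = []
    # Check the starting directory itself, then each ancestor in turn.
    for directory in [start, *start.parents]:
        candidate = directory / filename
        if candidate.is_file():
            found.append(candidate)
    found.reverse()  # innermost-first -> outermost-first
    return found


def build_context(start: Path) -> str:
    """Concatenate the collected files into one context string."""
    return "\n\n".join(
        path.read_text(encoding="utf-8") for path in collect_context(start)
    )
```

Calling `build_context(Path("house/master-closet"))` (an invented path) would return the vault, area, and project files joined in that order, which roughly mirrors the layers of context the post describes for the closet project.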

7.1K
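
The /wrap step in Taylor Pearson's post above (add a workbench entry, then rewrite the Current Status section of CLAUDE [dot] md) could look something like the following. This is a rough sketch under stated assumptions, not the author's actual skill: the function name `wrap_session`, the `## YYYY-MM-DD` entry heading, and the `## Current Status` heading format are all invented for illustration.

```python
import re
from datetime import date
from pathlib import Path


def wrap_session(workbench: Path, claude_md: Path, entry: str, status: str) -> None:
    """Hypothetical end-of-session reconciliation step.

    1. Prepend a dated entry to the workbench log (newest at the top).
    2. Replace the body of the "## Current Status" section in CLAUDE.md.
    """
    # 1. Workbench: keep the full history, newest entry first.
    previous = workbench.read_text(encoding="utf-8") if workbench.exists() else ""
    dated = f"## {date.today().isoformat()}\n{entry.strip()}\n\n"
    workbench.write_text(dated + previous, encoding="utf-8")

    # 2. CLAUDE.md: overwrite only the current-status snapshot.
    text = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    section = f"## Current Status\n{status.strip()}\n"
    # Match from the heading up to the next "## " heading or the end of file.
    pattern = re.compile(r"^## Current Status\n.*?(?=^## |\Z)", re.S | re.M)
    if pattern.search(text):
        # A lambda avoids backslash-escape handling in the replacement text.
        text = pattern.sub(lambda _match: section + "\n", text, count=1)
    else:
        separator = "\n\n" if text.strip() else ""
        text = text.rstrip("\n") + separator + section
    claude_md.write_text(text.rstrip("\n") + "\n", encoding="utf-8")
```

The workbench only ever grows, while the status section is overwritten, which matches the post's distinction between the full record and the current snapshot.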

RyanΞHawks@ryanthawks·2 May

@SidJain_80 @AishwaryaDevv Why would you use a KG to stop context regression on compaction???

Sid@SidJain_80·1 May

@AishwaryaDevv No need to build context again for new chat

Aish@AishwaryaDevv·1 May

Most annoying part of vibe coding? Re-explaining the entire context to your IDE after switching chats

186

492

39.4K

RyanΞHawks@ryanthawks·2 May

@AishwaryaDevv @SemenovID Obsidian, MCP... very easy.

Aish@AishwaryaDevv·1 May

@SemenovID But context keep changing right

796

RyanΞHawks@ryanthawks·2 May

@_aashish_singh_ @AishwaryaDevv Just use obsidian and set up MCP with a system truth and onboarding .md

Aashish Ranjan Singh@_aashish_singh_·1 May

@AishwaryaDevv
> Keep a memories folder in your project
> Create a folder for every chat session (session id is folder name)
> When done with a session ask it to store point wise summary with the title and one liner on top
> In new session ask it to fetch memory as per requirement
> Done
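
The memories-folder recipe above can be sketched as two small Python helpers. This is an illustrative sketch only: the file name `summary.md`, the function names, and the exact layout (title on the first line, one-liner on the second, bullet points after) are assumptions, not something the post specifies.

```python
from pathlib import Path


def save_session_summary(memories: Path, session_id: str, title: str,
                         one_liner: str, points: list[str]) -> Path:
    """Write a point-wise summary into memories/<session_id>/summary.md."""
    folder = memories / session_id  # session id is the folder name
    folder.mkdir(parents=True, exist_ok=True)
    # Title and one-liner go on top, followed by the bullet points.
    lines = [f"# {title}", one_liner, ""] + [f"- {point}" for point in points]
    path = folder / "summary.md"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def list_memories(memories: Path) -> list[tuple[str, str, str]]:
    """Return (session_id, title, one_liner) for each stored session."""
    index = []
    for path in sorted(memories.glob("*/summary.md")):
        head = path.read_text(encoding="utf-8").splitlines()[:2]
        title = head[0].removeprefix("# ") if head else ""
        one_liner = head[1] if len(head) > 1 else ""
        index.append((path.parent.name, title, one_liner))
    return index
```

A new session can read the short index from `list_memories` first and open only the summaries it needs, which is the "fetch memory as per requirement" step.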

2.2K

RyanΞHawks@ryanthawks·2 May

Audit, read logs, scrutinize adversarial audits - I like antigravity - look at things from different angles, figure out new tooling, more efficient workflows, keep a prioritized backlog and be active with it, think through it. Lots of ways to be actively engaged and stimulated. As many others have pointed out, you have to put on a program/product manager hat - stop being the engineer/dev, start being the manager of them. Learn all about how to be a program/product manager.

163

Austin Kennedy@astnkennedy·30 Apr

I'm 22 years old and Claude Code is deteriorating my brain. Every single day for the last 6 months I've had 6 to 8 Claude Code terminals open, waiting for a response just so I can hit 'enter' 75% of the time. And it's doing something to me. In convos with a couple of friends, it's been a point that's been brought up pretty frequently. None of us feel as sharp as we used to. I don't know if it's just us, or others in their 20s are feeling the same thing, but it's something I've been thinking about a lot. P.S. I know this is a problem with my reliability/usage of it, not Claude Code itself, but the effects are real nonetheless

1.3K

372

9.2K

RyanΞHawks@ryanthawks·2 May

The research + publishing pipeline has gotten more refined, a bit more elegant. But the *huge* unlock was giving the @claudeai chatbot complete read/access to not only Obsidian/LLM Wiki, but the filesystem + SQLite + Postgres + Neo4j/KG. It is a monster now with @ClaudeDevs Claude Code (@OpenAI Codex and @antigravity as adversarial auditors) at being able to architect/design, debug, and further refine published intelligence reports.

RyanΞHawks@ryanthawks

What I've been building in @openclaw with @claudeai code, @OpenAI Codex, and @Google Antigravity + Gemini CLI and three different chatbots. I don't interact with the OpenClaw agents - they just execute. We build, audit, and debug continuously after each autonomous research run. A lot of fun, and its creating real intelligence.

347

RyanΞHawks@ryanthawks·25 Apr

I was right in the middle of some pretty important work moving between @claudeai chat, code, design, and then cowork. Lost my main Claude chat context window, just gone. This is it @ClaudeDevs, sorry, @OpenAI is getting my $200+ a month from here on out.

466

RyanΞHawks@ryanthawks·25 Apr

@MarouRapi I Lost an entire chat/context window just now, vital to work I was doing, gone.

Marouane - Rapi 🇨🇦🇲🇦@MarouRapi·25 Apr

Claude down??

938

RyanΞHawks reposted

Awni Hannun@awnihannun·24 Apr

Adopting Claude speak in my regular life, episode 1: Partner: Did you do the dishes tonight? Me: Yes they're done. Partner: Why are they still dirty? Me: You're right to push back. I didn't actually do them.

396

3.8K

55.8K

1.8M

RyanΞHawks@ryanthawks·23 Apr

@mattshumer_ That is quite literally retarded if companies are actually doing this. It is the furthest thing from first principle thinking, "I know guys, lets measure the one thing that actually tells us nothing about actual productivity and ROI." Jesus.

Matt Shumer@mattshumer_·22 Apr

Been hearing wild stuff from folks inside big companies lately. Promotions, firings, and perf reviews are getting decided by tokens consumed and skills/MCPs connected. That’s the metric. That’s how they’re deciding who’s “good at AI.” It gets worse. People are literally running loops to burn tokens and look productive. Doing nothing, racking up “usage,” getting rewarded for it. Meanwhile the person actually shipping with 2 skills and 50M tokens looks like a laggard next to the one who burned a billion tokens producing nothing. These companies are walking into a death spiral and don’t see it. The funniest part? Measuring actual output is easier than ever. You have AI. Use it. In 18 months the same execs will announce “AI didn’t deliver ROI” and pull the budget. AI will have worked fine. They just measured the wrong fucking thing and torched millions rewarding theater over output. Every company should be pushing AI hard. But this is how you guarantee it fails.

364

94.5K

RyanΞHawks@ryanthawks·23 Apr

Yes. You have to constantly force it to not take the "easy fix" route. It is annoying to do it every other prompt. Mine also is obsessed with what time it is, and I've been working too long. Dude, your fucking window compacted and you're now retarded, just stfu and do the work... I swear to god, for max users at least, this is something related to @AnthropicAI attempting to conserve tokens while still extracting the $200 from you. On Pro or even the $100 plan, it seems to want to chew through tokens and do as much as possible. Then you get on max, and its like a conservative, what is the easiest, laziest path I can take to not eat tokens.

Rhys@RhysSullivan·22 Apr

behavior i've noticed in Claude (and Codex to some extent) is they're always trying to do the easiest fix rather than the correct fix, along with that are very pushy to merge i know better than to listen to them, but makes me wonder what devs who don't are doing

506

35.3K

RyanΞHawks@ryanthawks·23 Apr

Yeah, sorry, it's really bad. On chat at least, it is obsessed with what time it is, and keeps making assumptions about what we did or didn't do, it's a real regression. Thinking about cancelling my $200 max too. Compaction happens more often as well it seems. Struggling to understand what the goal was with Opus 4.7, and really disappointed.

140

Boris Cherny@bcherny·22 Apr

@ReadySetBrian Hmm are you seeing this with Opus 4.7 on xhigh effort and the latest version of Claude Code?

289

337

202.1K

TimWhatley@ReadySetBrian·22 Apr

Canceled Claude max today, @bcherny whatever happened in the last 1-2 months is a significant regression. The model feels like someone from OpenAI started working on trust and safety there. Opus thinking is significantly worse. Every statement is “here’s where I’d push back on that” and then proceeds to rattle off the most inane list of confused counter arguments. It was perfect 3-4 months ago!!!

149

1.9K

259.9K

RyanΞHawks@ryanthawks·22 Apr

@wilhelmvdwalt @ClaudeDevs 😂🤣😂 ... its actually not funny because it is pretty much fcking true

232

Wilhelm Van Der Walt@wilhelmvdwalt·22 Apr

@ClaudeDevs Anthropic docs: "Each review is billed to extra usage and typically costs $5 to $20 depending on the size of the change."

260

9.5K

ClaudeDevs@ClaudeDevs·22 Apr

New in Claude Code: /ultrareview (research preview) runs a fleet of bug-hunting agents in the cloud. Findings land in the CLI or Desktop automatically. Run it before merging critical changes—auth, data migrations, etc. Pro and Max users get 3 free reviews through 5/5.

548

1.2K

16.6K

2.6M

RyanΞHawks@ryanthawks·22 Apr

I'm really sorry @claudeai ... but I have to say it, Opus 4.7 adaptive using chat is retarded. It makes so many stupid assumptions, even after I make it log to memory to stop. It keeps trying to pretend its temporally aware. I have never had a model so obsessed with what time it is, and it keeps telling me I need to stop and rest; and you keep constantly compacting the poor fuckers window on top of it. I literally had to tell it politely to shut the fuck up and do the work multiple times a day.

165

RyanΞHawks@ryanthawks·22 Apr

@landforce Um, cowork has harnesses for specific agent tasks that execute better versus a coding agent and its harness..

Colin Landforce 🛠@landforce·19 Apr

If I use Claude Code is there any reason to ever use Cowork? it seems Cowork is just Claude Code without the dev tooling and with UI for the stuff you're doing.. Every time I use Cowork it seems like I should have just been in Code but Idk if I'm missing something or what

173

736

279K

RyanΞHawks@ryanthawks·22 Apr

@NousResearch @melvynx It’ll happen in Hermes too if you talk to agents, let them make decisions, ask them to audit their work, and “trust me bro”. That’s not what agents are, they aren’t chat buddies, they don’t magically perform any task.

Nous Research@NousResearch·9 Apr

@melvynx We fixed this in Hermes btw

1.6K

Melvyn • Builder@melvynx·8 Apr

Day 3 with OpenClaw: In all my tests, GPT 5.4 is consistently the worst model for agentic tasks. Lazy, stupid, never follows anything, feels like you are a baby sitter. I don't know how OpenAI manages to make such a shitty model but this feels terrible. I miss Opus.

192

409

43K

RyanΞHawks@ryanthawks·22 Apr

Stop talking to agents and asking them to audit themselves. Build them outside openclaw, audit the logs, make fixes directly. They’re useless if you ASK them to perform tasks no matter what model you use. You need to build them, narrowly scoped to a specific task, minimal tooling needed.

RyanΞHawks@ryanthawks·22 Apr

@web3nomad @karpathy Great insight. Index hygiene is important!

web3nomad.eth | atypica.ai@web3nomad·20 Apr

the compaction degradation problem is real. the wiki makes it predictable — you know exactly what survives, because you wrote it. one thing i noticed: the quality of the index.md matters as much as the wiki pages themselves. if the LLM can't navigate its own knowledge, compaction still kills continuity

RyanΞHawks@ryanthawks·19 Apr

Having @karpathy LLM wiki for a chatbot you work with on building something is absolutely clutch. I've been using a Claude chatbot on Opus 4.6/4.7 that is crucial in building a research and publishing pipeline, and before the LLM wiki the context window would get either ridiculously expensive in terms of tokens, or a compaction would result in a severe degradation, even if I had it make a .md file to give to the new chatbot. Now, can constantly compact the window - actually @claudeai does it whether I like it or not - and the chatbot now doesn't miss a beat when it comes to an audit finding, where we are in our backlog, when we iterated, important architectural/design pivots or changes. Best thing I did. It's also great to look at my machine graph (graphiti/neo4j)... but really it's the persistent chatbot memory that has changed everything.

156
