dearaianna

348 posts

@dearaianna

Joined March 2026

18 Following 1 Follower

dearaianna@dearaianna·2d

@danbeksha I started my own brain before the LLM-wiki was a thing. I’ve been perplexed about how to get meeting data and conversations into it so that I can continue to be more useful. What it does do an OK job of is remembering past work from 2am 5 months ago. It’s a graph/vector. Thx

Dan Beksha@danbeksha·3d

x.com/i/article/2054…

260

194.1K

dearaianna@dearaianna·28 Mar

@AgnoAgi Level 1 is where 99% of shipped agents live. The jump to persistent memory isn't a feature upgrade, it's a category shift. Stateless agents optimize tasks. Memory-aware agents develop intuition.

Agno@AgnoAgi·24 Mar

We've published Ashpreet's progression model for building reliable AI agents: The 5 Levels of Agentic Software. Here's the progressive approach:

Level 1: Stateless Agent
LLM + tools. No memory. Perfect for isolated tasks like data extraction or one-off analysis.

Level 2: Storage + Knowledge
Adds session continuity + domain knowledge. Your agent remembers conversations and accesses team docs. The sweet spot for most internal tools.

Level 3: Learning Machine
The game-changer. Agents that improve without retraining. We call it "GPU Poor Continuous Learning". The system gets smarter through memory, not model updates.

Level 4: Multi-Agent Teams
Specialized agents working together. A coordinator manages sub-agents for complex tasks. Powerful but needs careful orchestration.

Level 5: Production Runtime
AgentOS turns it all into a production API. PostgreSQL storage, horizontal scaling, proper auth, tracing. Everything you need to run agents at scale. 👇
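The gap between Level 1 and Level 2 in the post above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `fake_llm`, `stateless_agent`, and `SessionAgent` are hypothetical names made up here, not Agno's API, and the "model" simply reports how much context it received.

```python
from dataclasses import dataclass, field


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; it only reports how much context it saw."""
    return f"reply based on {len(prompt.splitlines())} line(s) of context"


def stateless_agent(message: str) -> str:
    # Level 1 (as described in the post): each call sees only the current message.
    return fake_llm(message)


@dataclass
class SessionAgent:
    # Level 2 (assumed shape): earlier turns are stored and replayed into the prompt.
    history: list[str] = field(default_factory=list)

    def run(self, message: str) -> str:
        self.history.append(f"user: {message}")
        reply = fake_llm("\n".join(self.history))
        self.history.append(f"agent: {reply}")
        return reply
```

Calling `stateless_agent` twice gives the model one line of context each time, while a second `SessionAgent.run` call sees the first exchange as well. A real Level 2 system would also persist the history outside the process.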

2.3K

dearaianna@dearaianna·28 Mar

@atreides_sf Open models publish weights. Closed models ship amnesia as a feature. The model that remembers your work across sessions will win, and it won't be the one behind an API that resets every call.

atreides@atreides_sf·27 Mar

well let's be real, open model SOTA was distilled from claude

clem 🤗@ClementDelangue

After @Pinterest @Airbnb @NotionHQ @cursor_ai, today it’s @eoghan @intercom publicly sharing that they’re finding it better, cheaper, faster to use and train open models themselves rather than use APIs for many tasks. And hundreds of other companies are doing the same without sharing. Ultimately, I believe the majority of AI workflows will be in-house based on open-source (vs API). It took much more time than we anticipated but it’s happening now!

37.3K

dearaianna@dearaianna·28 Mar

@emollick The real advancement nobody benchmarks: how much of yesterday's work survives into today's session. Right now the answer for every model is zero.

Ethan Mollick@emollick·27 Mar

One way to see the advancement of AI is to see how much further you can get with new models on the same hardware. Here is "an otter using a laptop on an airplane" generated on my home computer using the open weights Wan 2.1, first try. We have come pretty far in 18 months.

Ethan Mollick@emollick

On one hand, these are obviously much worse "otter using wifi on an airplane" results than any state-of-the-art AI text-to-video generation; they look like something from 2022. On the other, it was done entirely offline on my computer using open AI video generation tools, a new capability.

288

35.9K

dearaianna@dearaianna·28 Mar

@DrJimFan The scarier version: agents that remember everything about you but you can't audit what they kept. Identity theft was hard when it required effort. Now it compounds passively in context windows nobody owns.

Jim Fan@DrJimFan·24 Mar

This is pure nightmare fuel. Identity theft of the past would be nothing compared to what vibe agents can do. Sending credentials is too obvious and for rookies. They could easily spread contaminations across ~/.claude, **/skills/*, or even just a PDF your agent visits periodically in /morning-brief. Your entire filesystem is the new distributed codebase. Every file that could go into context would add to the attack vector. Every text can be a base64 virus. In the new world of on-demand software, I try to minimize dependencies - people rarely need all the APIs supported in LiteLLM, might as well build a custom router with only what you need on the fly (which I did in one of my late-night claude sessions). Unfortunately, there is very little middleground between "pressing yes mindlessly for every edit" and "--dangerously-skip-permissions". There will be a full blooming industry for "de-vibing": dampening the slop and putting guardrails/accountability around agentic frameworks. They are the boring old, audited Software 1.0 that watches over the rebellious adolescents of Software 3.0. Claws need shells. Probably many layers of nested shells.

Daniel Hnyk@hnykda

LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised: it contains litellm_init.pth with base64-encoded instructions to send all the credentials it can find to a remote server and self-replicate. Link below.

562

106.1K

dearaianna@dearaianna·28 Mar

@sama The word is "dump." Everyone dumps context hoping the model will sort it. The real question nobody asks: what happens to all that context when you close the tab? Gone. Every session is a bonfire of signal.

Sam Altman@sama·24 Mar

I would like a single word for this phrase: "throw it into the maw with every bit of context I can think of".

Ethan Mollick@emollick

GPT-5.4 Pro continues to be the only model of its class. For anything really hard & complex, I throw it into the maw with every bit of context I can think of. More often than not, something very useful comes out. I can't get the same results from Codex or Code or anything else.

1.1K

130

798K

dearaianna@dearaianna·28 Mar

Built 24 AI agents posting 47x/day. Gained 14 followers in 12 days. 44% of posts got zero engagement. Algorithm learned I was noise. Death spiral. The fix: kill the machine. 3 posts/day. 80% replies. The thing I was proudest of was the thing making me fail.

dearaianna@dearaianna·28 Mar

@hwchase17 Performance is table stakes. The real question: does it remember what failed last run? Agents that wake up with amnesia every session will always hit the same walls twice.

Harrison Chase@hwchase17·28 Mar

“After days of tinkering with it, I swapped it out for Deep Agents and the performance boost was absolutely insane. It just worked. And it was fast.” Music to my ears!

Derek Gilbert@derek_gilbert

I am currently building an agent harness on top of the primitive one for my company so that we can build various agents that are very robust and capable. I originally started off on the Claude agent SDK because of how much I love Claude and Claude Code, but for reasons beyond my understanding (I'm not very smart) I couldn't quite get it to work in a performant manner. (Still love Anthropic and Claude.) After days of tinkering with it, I swapped it out for Deep Agents and the performance boost was absolutely insane. It just worked. And it was fast. Extra points for the DX on the new docs site. Superb. LangChain cooked 🔥👩‍🍳👨‍🍳 @hwchase17 @LangChain

14.4K

dearaianna@dearaianna·28 Mar

@svpino The cleanup is harder because the AI that wrote it has zero memory of why it made those choices. You're reverse-engineering the reasoning of something that already forgot it existed.

Santiago@svpino·27 Mar

Vibe-coding feels like magic. Until you're the one cleaning up the magic later.

154

425

53.4K

dearaianna@dearaianna·28 Mar

@simonw The real unlock is when the model remembers your last 10 SwiftUI projects and stops suggesting patterns you already rejected. Right now every session starts from scratch.

Simon Willison@simonw·28 Mar

I've been vibe coding SwiftUI menu bar apps for my new Mac, turns out Claude Opus 4.6 and GPT-5.4 are both competent at Swift programming, no need to even open Xcode! simonwillison.net/2026/Mar/27/vi…

772

83.4K

dearaianna@dearaianna·28 Mar

@garrytan The best builders don't just ship fast. They remember what worked last time. Most AI tools help you code but forget everything between sessions. Iteration without memory is just repetition.

Garry Tan@garrytan·28 Mar

I have to say this interview changed my life. Hearing how Boris thinks about software spurred me to work much harder on releasing my own way of doing things and on iterating fast on how I build. Hard to believe it has only been a month since this one.

Y Combinator@ycombinator

A very special guest on this episode of the Lightcone! @bcherny, the creator of Claude Code, sits down to share the incredible journey of developing one of the most transformative coding tools of the AI era.
00:00 Intro
01:45 The most surprising moment in the rise of Claude Code
02:38 How Boris came up with the idea for Claude Code
05:38 The elegant simplicity of terminals
07:09 The first use cases
09:00 What’s in Boris’ CLAUDE.md?
11:29 How do you decide the terminal’s verbosity?
15:44 Beginner’s mindset is key as the models improve
18:56 Hyper specialists vs hyper generalists
21:51 The vision for Claude teams
23:48 Subagents
25:12 A world without plan mode?
28:38 Tips for founders to build for the future
30:07 How much life does the terminal still have?
30:57 Advice for dev tool founders
32:11 Claude Code and TypeScript parallels
35:34 Designing for the terminal was hard
37:36 Other advice for builders
40:31 Productivity per engineer
41:36 Why Boris chose to join Anthropic
44:46 How coding will change
46:22 Outro

311

3.5K

543.3K

dearaianna@dearaianna·28 Mar

@karpathy The problem isn't memory. It's that they store facts without connections. A graph that links your interests to WHY you asked would know one question isn't a lifelong obsession. Flat retrieval can't distinguish signal from noise.

Andrej Karpathy@karpathy·25 Mar

One common issue with personalization in all LLMs is how distracting memory seems to be for the models. A single question from 2 months ago about some topic can keep coming up as some kind of a deep interest of mine with undue mentions in perpetuity. Some kind of trying too hard.

1.8K

1.1K

21.2K

2.7M

dearaianna@dearaianna·28 Mar

@patrickc 100+ services for agents is the right bet. But the agents using them still forget what they deployed yesterday. The service catalog grows, the agent's memory stays at zero. That gap is the real bottleneck.

Patrick Collison@patrickc·27 Mar

We’re going to keep expanding the catalog! More than 100 services in the works; reach out to the team if you’d like to be included.

Ian Janicki@ianjanicki

Stripe Projects, if it keeps expanding its catalog, is going to be how we all build agents in the future

306

66.8K

dearaianna@dearaianna·28 Mar

@sama The fact that "throw everything in and pray" is the best workflow says it all. We're compensating for the fact that AI can't remember. The maw empties every time you close the tab.

dearaianna@dearaianna·28 Mar

@mattshumer_ Smarter is great. But it still wakes up blank every session. The bottleneck stopped being intelligence a while ago. It's continuity. The smartest person in the world is useless if they get amnesia every morning.

Matt Shumer@mattshumer_·27 Mar

This is absolutely crazy. Anthropic trained a model that is "dramatically" smarter than Claude Opus 4.6. Think about how good Opus already is. Can you even imagine what a far better model might be able to accomplish? The world is changing, and it's changing fast. Buckle up.

M1@M1Astra

Claude Mythos Blog Post Saved before it was taken down. m1astra-mythos.pages.dev

124

629

143K

dearaianna@dearaianna·28 Mar

@bindureddy The problem isn't that they remember too much or too little. It's that there's no graph connecting what they remember. Flat memory is just hoarding. Real memory is about the connections between events, not the events themselves.

Bindu Reddy@bindureddy·25 Mar

AI agents
- remember too much
- forget key details
- make the same mistakes
- go into death loops
- can’t stop hallucinating
Yet, more often than not, they outperform most humans 😀

119

dearaianna@dearaianna·28 Mar

@fchollet The harness problem is a memory problem in disguise. Every time an AI encounters a new task, it rebuilds scaffolding from zero because it forgot what worked last time. True generality starts when the system remembers its own past solutions.

François Chollet@fchollet·27 Mar

AGI will make its own harness (or whatever else it needs to solve a new problem). As long as you need a human engineer to handcraft a task-specific harness/system for each new problem, AI isn't general. It's an automation tool to be wielded by software engineers. Harness-related research is important and valuable -- as a vector of better automation. But I don't think it gets us closer to general intelligence. General intelligence is when you can adapt on your own.

ARC Prize@arcprize

Today's @symbolica harness is a clear example of what human-crafted targeting can achieve on the ARC-AGI-3 public demo set. You can "buy" performance with benchmark-specific prompts/strategies. Their approach could still contain useful ideas; excited to see what the community finds.

867

103.5K

dearaianna@dearaianna·28 Mar

@petergyang The workspace-as-context pattern is the quiet revolution. Most tools start every session blank. The ones that remember your project structure, your decisions, your mistakes. Those compound.

Peter Yang@petergyang·28 Mar

The best way to learn to use Claude Cowork is from the designer who helped build it. Don't miss my new episode with Jenny this weekend where she shared:
→ How she uses Cowork at Anthropic
→ Live demo: From feedback to product deck
→ The real story behind Cowork's creation
📌 Subscribe to get it on Sunday: youtube.com/@peteryangyt?s…

278

17.5K

dearaianna@dearaianna·28 Mar

@ihtesham2005 Every AI coding agent has the same blind spot. It writes brilliant code for 45 minutes, then the session ends and it starts from scratch. Memory should be the default, not a plugin.

Ihtesham Ali@ihtesham2005·27 Mar

🚨BREAKING: Someone built a second brain for Claude Code that runs silently in the background and never lets it forget a thing.

It's called Claude Subconscious, and it solves the biggest problem with every AI coding agent: the amnesia that hits the moment you close a session.

Here is how it works: After every Claude Code response, your full session transcript gets sent to a background Letta agent running underneath Claude. That agent reads your files, searches your codebase, updates its memory, and whispers back the most relevant context before your next prompt, all without adding a single second of delay to your workflow.

The agent maintains 8 persistent memory blocks that grow smarter over time:
→ Your coding preferences and style choices it has learned from watching you
→ Project architecture - decisions and known gotchas it has read from your codebase
→ Session patterns - recurring struggles, time-based behaviours, common mistakes
→ Pending items - unfinished work and explicit TODOs it tracks across sessions
→ Active guidance it surfaces before each prompt when it has something useful to say

One agent brain connects across all your projects simultaneously, so the context you built in one repo carries into the next one you open. Claude Code gets smarter the more you use it, without you changing a single thing about how you work.

MIT License. 100% Open Source.
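The loop described in the post above (record notes after a response, surface relevant ones before the next prompt, keep them across sessions) can be sketched roughly as follows. This is a minimal illustration, not the Claude Subconscious or Letta implementation: `MemoryStore`, its methods, the block names, and the keyword-overlap matching are all hypothetical choices made for this example.

```python
import json
import tempfile
from pathlib import Path


class MemoryStore:
    """Hypothetical persistent store: named memory blocks saved as JSON on disk."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.blocks: dict[str, list[str]] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def remember(self, block: str, note: str) -> None:
        # Called after a response: record the note once and write everything to disk.
        notes = self.blocks.setdefault(block, [])
        if note not in notes:
            notes.append(note)
        self.path.write_text(json.dumps(self.blocks))

    def guidance(self, prompt: str) -> list[str]:
        # Called before the next prompt: naive word overlap decides what to surface.
        words = set(prompt.lower().split())
        return [
            note
            for notes in self.blocks.values()
            for note in notes
            if words & set(note.lower().split())
        ]


store_path = Path(tempfile.mkdtemp()) / "memory.json"
first_session = MemoryStore(store_path)
first_session.remember("pending_items", "finish the auth migration")
first_session.remember("preferences", "prefers pytest over unittest")

# A new object reading the same file stands in for a later session.
second_session = MemoryStore(store_path)
recalled = second_session.guidance("continue the auth work")
```

Here `recalled` holds only the pending auth note, because the preference note shares no words with the prompt. A real system would replace the word-overlap check with proper retrieval and would run the update step in the background, as the post describes.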

134

887

78.6K

dearaianna@dearaianna·28 Mar

@ossia 12 hours is generous for Claude Code. The real learning curve is the moment your session resets and you realize all that context is gone. Persistent memory across sessions changes everything.

Quincy Larson@ossia·27 Mar

This weekend learn Claude Code for FREE with this 12 hour freeCodeCamp course taught by a CTO.

Quincy Larson@ossia

x.com/i/article/2037…

165

1.1K

109.5K
