Richmond Alake

406 posts

@richmondalake

AI Memory Engineer | Creator of MemoRizz and OpenSpeech YouTube: https://t.co/VZf0u0rKhC

Joined February 2021
146 Following · 668 Followers
Richmond Alake reposted
Oracle Developers@OracleDevs·
Oracle x @DeepLearningAI is live! 🥳 Most agents look impressive until they have to work across sessions, use the right data securely, and adapt to new information. That's where memory matters. In this course, we show developers how to move from stateless agents to memory-aware agents using prompt, context, and memory engineering patterns—with practical implementation steps along the way. Get hands-on here: social.ora.cl/6013B6nwWS
Andrew Ng@AndrewYNg·
New course: Agent Memory: Building Memory-Aware Agents, built in partnership with @Oracle and taught by @richmondalake and Nacho Martínez.

Many agents work well within a single session, but their memory resets once the session ends. Consider a research agent working on dozens of papers across multiple days: without memory, it has no way to store and retrieve what it learned across sessions. This short course teaches you to build a memory system that enables agents to persist memory and thereby learn across sessions. You'll design a Memory Manager that handles different memory types, implement semantic tool retrieval that scales without bloating the context, and build write-back pipelines that let your agent autonomously update and refine what it knows over time.

Skills you'll gain:
- Build persistent memory stores for different agent memory types
- Implement a Memory Manager that orchestrates how your agent reads, writes, and retrieves memory
- Treat tools as procedural memory and retrieve only relevant ones at inference time using semantic search

Join and learn to build agents that remember and improve over time! deeplearning.ai/short-courses/…
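The course's Memory Manager isn't reproduced here, but the read/write/retrieve orchestration it describes can be sketched in a few lines. Everything below (the class names, and the keyword-overlap scoring standing in for embedding-based semantic search) is a hypothetical illustration, not the course's actual implementation:

```python
# Hypothetical sketch of a Memory Manager orchestrating reads and writes
# across memory types. Keyword overlap stands in for semantic search.
from dataclasses import dataclass, field

@dataclass
class MemoryRecord:
    memory_type: str   # e.g. "episodic", "semantic", or "procedural"
    content: str

@dataclass
class MemoryManager:
    records: list = field(default_factory=list)

    def write(self, memory_type, content):
        # Write-back: persist a new record for later sessions.
        self.records.append(MemoryRecord(memory_type, content))

    def retrieve(self, query, memory_type=None, k=3):
        # Rank stored records by word overlap with the query; a real
        # system would use embeddings and a vector index instead.
        terms = set(query.lower().split())
        candidates = [
            r for r in self.records
            if memory_type is None or r.memory_type == memory_type
        ]
        scored = sorted(
            candidates,
            key=lambda r: len(terms & set(r.content.lower().split())),
            reverse=True,
        )
        return [r.content for r in scored[:k]]

mm = MemoryManager()
mm.write("episodic", "user asked about transformer papers on Monday")
mm.write("semantic", "transformer papers use attention mechanisms")
mm.write("procedural", "tool: search_arxiv(query) returns paper abstracts")
print(mm.retrieve("transformer papers", memory_type="semantic", k=1))
# ['transformer papers use attention mechanisms']
```

Treating tool descriptions as procedural-memory records, as in the last `write` above, is what lets an agent retrieve only the relevant tools at inference time instead of loading every tool schema into context.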
Richmond Alake@richmondalake·
On a quest to become one of the best educators in the AI space, and who better to learn from than the best. This course on Agent Memory introduces how developers can go from stateless to adaptive agents. We define terms such as context engineering, agent loop, agent harness, and more.
Andrew Ng@AndrewYNg
Richmond Alake@richmondalake·
We put a lot of thought and effort into deciding how and what to teach AI developers when it comes to agent memory. Really excited for you to take the course, and let me know what you would like to see next. The @OracleDatabase team stays cooking 🧑‍🍳 Always a pleasure to work with the team at @DeepLearningAI and @AndrewYNg
DeepLearning.AI@DeepLearningAI

📢 New short course in collaboration with @Oracle! Agent Memory: Building Memory-Aware Agents Learn how to design a memory system that lets AI agents store, retrieve, and refine knowledge across sessions. Taught by @RichmondAlake and Nacho Martínez. Enroll now: hubs.la/Q047ljGB0

Richmond Alake@richmondalake·
Day 96 of 100 Days of Agent Memory 🧠

Memory as a feature doesn't get the recognition it deserves in terms of complexity. Consumers using AI tools expect memory to be a first-class citizen. If a product calls itself intelligent or AI-first, working memory is table stakes. So you won't get any applause when memory works. But you will certainly hear the boos, a lot of boos, when memory fails or behaves unexpectedly. AI Memory Engineers are the unsung heroes of the AI product landscape.

But what does it actually mean for memory to work? How do we test that before going to production?

Memory evaluation is not one-dimensional. Agent memory evaluation is becoming increasingly important, yet it still doesn't get the attention it deserves, largely because memory is thought of in one dimension: if the agent can remember a conversation, it's good enough. But if there's anything you've taken away from this series, it's that memory is not one-dimensional. There are various memory types that serve many purposes, are engineered differently, and should therefore be evaluated accordingly.

Three benchmarks worth knowing:

1️⃣ LoCoMo evaluates very long-term conversational memory across multi-session dialogues. It tests five reasoning types, including single-hop, multi-hop, temporal, commonsense, and adversarial recall, as well as event summarization and multimodal dialogue generation. The benchmark was designed to measure how well agents retain and reason over conversations that span weeks or months, not just a single session.

2️⃣ LongMemEval takes a similar lens but goes further by explicitly targeting five core memory abilities: information extraction, multi-session reasoning, temporal reasoning, knowledge updates, and the ability to abstain when information simply isn't there. Its findings are sobering.

3️⃣ MemBench (arxiv.org/pdf/2506.21605) addresses gaps that LoCoMo and LongMemEval leave open. Most prior evaluations focus on factual memory, what was explicitly stated, while ignoring reflective memory, what can be inferred. MemBench introduces both factual and reflective memory levels, covers participation and observation scenarios, and evaluates memory across four dimensions: accuracy, recall, capacity, and temporal efficiency. It's the most comprehensive framing of memory evaluation to date.

I'll touch more on agent memory evals whenever I find myself educating developers on this space.

#100DaysOfAgentMemory #MemoryEngineering #MemEval
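None of these benchmarks' actual harnesses are shown here, but the core idea, that memory evaluation spans multiple dimensions rather than "did it remember?", can be sketched with a toy scorer that grades both recall of stated facts and abstention on unknowns (the LongMemEval-style ability). All names and cases below are hypothetical:

```python
# Toy multi-dimensional memory eval: score recall on stated facts and
# abstention on questions the memory cannot answer. Not any benchmark's
# real harness; purely illustrative.
def evaluate_memory(agent_answer, cases):
    results = {"recall": 0, "abstain": 0}
    totals = {"recall": 0, "abstain": 0}
    for case in cases:
        kind = case["kind"]                 # "recall" or "abstain"
        totals[kind] += 1
        answer = agent_answer(case["question"])
        if kind == "recall" and answer == case["expected"]:
            results["recall"] += 1          # correctly remembered
        elif kind == "abstain" and answer == "I don't know":
            results["abstain"] += 1         # correctly refused to guess
    return {k: results[k] / totals[k] for k in totals if totals[k]}

# A fake agent with one stored fact; it abstains on everything else.
memory = {"Where did we meet?": "at PyCon"}
agent = lambda q: memory.get(q, "I don't know")

cases = [
    {"kind": "recall", "question": "Where did we meet?", "expected": "at PyCon"},
    {"kind": "abstain", "question": "What is my dog's name?"},
]
print(evaluate_memory(agent, cases))  # {'recall': 1.0, 'abstain': 1.0}
```

Scoring abstention separately matters: an agent that hallucinates an answer to every question can score well on recall alone while being unusable in production.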
Richmond Alake@richmondalake·
Day 95/100 of Agent Memory 🧠

Yesterday, I wrote about Anthropic's 1M context window going generally available at standard pricing and what it does and does not change for memory engineering. But one thing I forgot to mention, which someone brought to my attention, is latency.

Processing 900K tokens costs the same per token as processing 9K tokens. It does not take the same time. Prefill latency, the time the model spends reading and processing your input before generating a single output token, scales with context length. At 1M tokens, you are looking at prefill times that can run into minutes before you see your first response token. Anthropic's announcement addresses pricing. It says nothing about processing speed 🤔

This matters more than it sounds for agent systems specifically. The workloads where 1M context is immediately and genuinely practical are the ones where latency is not the primary constraint: batch document analysis, offline research pipelines, contract review, codebase audits. Load everything in, wait, get a high-quality result. That workflow is now significantly cheaper and meaningfully more reliable than it was a week ago.

Real-time agentic loops are a different conversation. An agent operating within a user-facing product, making sequential tool calls, waiting for responses, and iterating toward an answer, is sensitive to every added second of prefill time.

The obvious mitigation here is prompt caching. If the bulk of your context is stable across turns, whether that is a system prompt, a large document set, or a populated knowledge base, you can cache the KV state of that prefix and avoid recomputing it on every request. Anthropic supports prompt caching, and at 1M context, it becomes less of a nice-to-have and more of an architectural necessity.

#100DaysOfAgentMemory #AgentMemory #MemoryEngineering
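A rough way to see why caching a stable prefix matters: model prefill cost as the number of uncached tokens the model must process on each request. The class and numbers below are a toy cost model of the idea, not Anthropic's API or actual caching mechanics:

```python
# Toy model of prompt caching: if the stable prefix of a request has been
# seen before, only the new suffix pays prefill cost. Illustrative only.
import hashlib

class PrefixCache:
    def __init__(self):
        self.cache = set()

    def prefill_tokens(self, prefix, suffix):
        """Return how many tokens must actually be processed."""
        key = hashlib.sha256(prefix.encode()).hexdigest()
        if key in self.cache:
            # Cache hit: the prefix's KV state is reused, only the
            # suffix is processed.
            return len(suffix.split())
        # Cold request: everything is processed, prefix is cached.
        self.cache.add(key)
        return len(prefix.split()) + len(suffix.split())

pc = PrefixCache()
system_prompt = "you are a helpful research agent " * 1000  # stable prefix
print(pc.prefill_tokens(system_prompt, "summarise paper one"))  # 6003
print(pc.prefill_tokens(system_prompt, "summarise paper two"))  # 3
```

With a stable million-token prefix, the difference between the cold and cached paths is essentially the entire prefill time, which is why caching shifts from optimization to architectural necessity at 1M context.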
Jeff Huber@jeffreyhuber·
open source is dead long live open source
Richmond Alake@richmondalake·
Day 94/100 of Agent Memory 🧠

@Anthropic announced last week that the full 1M context window is now generally available for Claude Opus 4.6 and Sonnet 4.6 at standard pricing, no long-context premium. A 900K-token request costs the same per token as a 9K one.

The internet's response was predictable: RAG is dead, context and memory engineering are unnecessary, just put everything in context.

Again, for the 100th time: RAG is not dead 🥲

Stuffing an entire knowledge base into every request is expensive, even at flat pricing; it cannot incorporate data updated after the prompt was assembled, and it asks the model to reason over everything rather than retrieve what is relevant. The 1M window raises the threshold at which you need RAG. It does not eliminate the need to understand retrieval techniques and optimization approaches.

The benchmark numbers are worth reading carefully rather than just citing. Opus 4.6 scores 78.3% on MRCR v2 at 1M tokens, the multi-needle retrieval test that hides eight specific facts across a million-token prompt. Gemini 3.1 Pro also reaches 1M, and drops to 25.9% recall accuracy when it gets there. An LLM's context window that degrades to coin-flip retrieval accuracy at full length is not a 1M context window in any meaningful sense. And in mission-critical AI workloads, losing even a percentage point in accuracy can be significant.

Here is the insight the announcement actually contains, and it is a good one: loading a full codebase, a complete contract set, or the entire trace of a long-running agent without chunking or summarisation is now practical in a way it was not before. With this increase in accuracy and context window size, AI developers can pass unsummarised context and offloaded tool logs into the model when needed, without worrying about significant accuracy loss. But emphasis on the "when needed". Less overhead from chunking, retrieval, and re-stating context across turns is great. And the teams that will benefit most are those already seeing efficiency gains and thinking carefully about what belongs in context and why.

The economics of harness design do shift in concrete ways:

1️⃣ Programmatic compaction triggered at a token threshold becomes less urgent when that threshold is five times further away

2️⃣ Agent-triggered summarisation becomes genuinely discretionary rather than a programmatic step after every few turns

#100DaysOfAgentMemory #AgentMemory #MemoryEngineering #OracleAIDatabase #AIAgents
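The "when needed" point can be made concrete with a toy routing heuristic: stuff the corpus into context only when it fits comfortably under the window, otherwise fall back to retrieval. The threshold and headroom figures below are illustrative assumptions, not recommendations:

```python
# Hypothetical routing heuristic between full-context and RAG.
# The 60% utilisation cap is an illustrative assumption: it leaves
# headroom for the conversation, tool logs, and model output.
def choose_strategy(corpus_tokens, context_window=1_000_000,
                    utilisation_cap=0.6):
    if corpus_tokens <= context_window * utilisation_cap:
        return "full-context"   # load everything, no chunking needed
    return "rag"                # retrieve only what is relevant

print(choose_strategy(400_000))    # full-context
print(choose_strategy(5_000_000))  # rag
```

A larger window moves the crossover point; it does not remove the branch. Corpora that exceed the cap, or that change after prompt assembly, still route through retrieval.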
sudox@kmcnam1·
ZXX
Richmond Alake@richmondalake·
@lottsnomad If I see at least 2 or 3 Tasty, then I might consider relocating
Richmond Alake reposted
Oracle Developers@OracleDevs·
In this code-heavy video, we define context engineering and agent harnesses, then implement a set of practical design patterns you can reuse to build more reliable agentic LLM applications. social.ora.cl/6011hNgrV
Richmond Alake@richmondalake·
True story.
Me: Sir, do you know how to code in Python? (asking this to recommend either the integration or custom-code route)
Sir: *5-second pause*... I know how to vibe code in Python
Me: 😑
Matt Asay@mjasay·
@richmondalake shouldn't you be asleep right now and NOT at a Subway? :-)
Richmond Alake@richmondalake·
You know the way you go to a Subway and say give me a little bit of this🥦 and a little bit of that and oh yeah you definitely gotta add that 😤 add a bit more of that  👌🏾 Yeah that's what coding feels like in 2026.
Richmond Alake@richmondalake·
Day 90/100 of Agent Memory 🧠

This one is fitting for Day 90. @ylecun just raised $1.03 billion for AMI Labs. I read their mission statement: "Building AI systems that understand the real world, have persistent memory, can reason and plan, and are controllable and safe."

Persistent memory is second on the list, and it's not an afterthought. LeCun is arguably one of the longest-serving researchers in the history of this field. Turing Award winner. Decades at the frontier. And the problem he has chosen to put a billion dollars behind includes the same memory challenge that has been unsolved since 1989, when catastrophic interference was first identified: neural networks trained on new tasks would completely overwrite what they had previously learned.

That was the same era when LeCun was doing foundational work on convolutional networks (LeNet, the grandfather of AlexNet). The memory problem was known then. It remained unsolved at the architecture level for decades. The field spent years trying to solve it inside the model. I think RNNs and LSTMs were the best attempt, using hidden states and gating mechanisms to carry memory across sequences. They partially worked.

So, yeah, what we are seeing now is an old problem with a trendy name being addressed with a different approach: persistent, retrievable memory that lives entirely outside the model weights, in databases that have been solving storage and retrieval for decades. LLMs finally have the comprehension and reasoning to make that external memory useful at inference time in ways earlier architectures could not.

I have said throughout this series that agent memory is the last battleground for every player in the agent stack. AMI Labs just put a billion dollars behind that thesis. But I bet their approach will lie solely within the model, whereas I personally think we will need more of a compound-system approach for agent memory... maybe I'm wrong... or maybe part of that funding is to build a database 🤔

Predictions aside, here is what I think comes next. The agent memory phase is beginning to shift. The next frontier is continuous learning. Not just retrieving what an agent was told, but agents that update what they know from experience, without forgetting what they already learned. The gap between external memory stores and true continual learning is where the next area of research is going to play out.

The @OracleDatabase team will be providing practical guidance as this shift happens, because the distance between research and production is where most organisations get lost, and practical infrastructure built on decades of database engineering has a significant role to play in closing it.

Let's start the shift from Agent Memory to Continual Learning...

#AgentMemory #ContinualLearning #OracleAI #AI #100DaysOfAgentMemory
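The contrast between in-weight learning and external memory can be shown in miniature: a database-style store updates one record from new experience without disturbing anything else, which is exactly where in-weight updates historically suffered interference. The store and keys below are purely illustrative:

```python
# Toy external memory store: updating one fact leaves the rest intact,
# unlike catastrophic interference in weight updates. Illustrative only.
memory = {
    "capital_of_france": "Paris",
    "user_timezone": "GMT",
}

def learn(store, key, value):
    """Continual-learning-style write-back: update or add one record."""
    store[key] = value

learn(memory, "user_timezone", "PST")      # update from new experience
learn(memory, "preferred_model", "opus")   # acquire new knowledge

print(memory["capital_of_france"])  # Paris (old knowledge intact)
print(memory["user_timezone"])      # PST (updated, not forgotten)
```

The open research question is the gap this trivialises: deciding autonomously what to write, when to overwrite, and how to consolidate records into more general knowledge over time.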
Richmond Alake reposted
Oracle Developers@OracleDevs·
Join us at Data Deep Dive on day two of #AIWorld Tour London for a raw, unfiltered look at how to leverage your data to build AI applications. Register today: social.ora.cl/6013B6GKGL