Yafeng(Jason) Deng

21 posts


@LongTermMemoryE

CEO@EverMind AI. Giving AI a memory. Dedicated to Long-term Memory & Continuous Learning to build Agents with true personalization and proactivity.

Joined December 2025
46 Following · 31 Followers
Yafeng(Jason) Deng@LongTermMemoryE·
MSA (Memory Sparse Attention) is our major exploration in long-term memory: the first end-to-end long-term memory framework for large models to genuinely reach a 100M-token context length. Notably, as the memory length scales from 16K to 100M, the model's score drops by a mere 9%, demonstrating highly robust scalability.

Main contributions:
1. We propose MSA, an end-to-end trainable, scalable sparse attention architecture with document-wise RoPE that extends intrinsic LLM memory while preserving representational alignment. It achieves near-linear inference cost and exhibits <9% degradation even when scaling from 16K to 100M tokens.
2. We introduce KV cache compression to reduce memory footprint and latency while maintaining retrieval fidelity at scale. Paired with Memory Parallel, it enables high-throughput processing of 100M tokens under practical deployment constraints, such as a single 2×A800 GPU node.
3. We present Memory Interleave, an adaptive mechanism that facilitates complex multi-hop reasoning. By iteratively synchronizing and integrating the KV cache across scattered context segments, MSA preserves cross-document dependencies and enables robust long-range evidence integration.
4. Comprehensive evaluations on long-context QA and Needle-In-A-Haystack benchmarks demonstrate that MSA significantly outperforms frontier LLMs, state-of-the-art RAG systems, and leading memory agents.

Feedback is welcome: github.com/EverMind-AI/MSA zenodo.org/records/191036…

We are also looking for passionate talent to join our team! If you are interested in our work and vision, please don't hesitate to email us at evermind@shanda.com.
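To make the "document-wise RoPE" idea concrete, here is a minimal, hypothetical sketch (plain PyTorch, not the released MSA code): rotary position indices restart at zero for each memory document, so indices stay within the trained range no matter how many documents the memory accumulates.

```python
import torch

def rope_angles(positions: torch.Tensor, head_dim: int, base: float = 10000.0):
    # Standard RoPE frequencies: one angle per pair of dimensions.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    return torch.outer(positions.float(), inv_freq)  # (seq_len, head_dim // 2)

def apply_rope(x: torch.Tensor, positions: torch.Tensor) -> torch.Tensor:
    # Rotate query/key vectors x of shape (seq_len, head_dim) by their positions.
    angles = rope_angles(positions, x.shape[-1])
    cos, sin = angles.cos(), angles.sin()
    x_even, x_odd = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x_even * cos - x_odd * sin
    out[..., 1::2] = x_even * sin + x_odd * cos
    return out

def document_wise_positions(doc_lengths):
    # Document-wise indexing: positions restart at 0 for every document,
    # instead of running globally over the whole multi-million-token memory.
    return torch.cat([torch.arange(n) for n in doc_lengths])

# Example: three memory documents. The total memory can keep growing, but no
# position index ever exceeds the length of the longest single document.
doc_lengths = [5, 3, 4]
keys = torch.randn(sum(doc_lengths), 64)
rotated_keys = apply_rope(keys, document_wise_positions(doc_lengths))
print(document_wise_positions(doc_lengths).tolist())
# -> [0, 1, 2, 3, 4, 0, 1, 2, 0, 1, 2, 3]
```

The sparse attention, KV cache compression, and Memory Interleave described above would sit on top of this positional bookkeeping; the sketch only illustrates the position scheme itself.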
Yafeng(Jason) Deng@LongTermMemoryE·
arxiv.org/pdf/2602.01313 EverMemBench is our long-term memory evaluation benchmark built for multi-person collaboration scenarios. It quietly went live a few weeks ago and has already seen several hundred downloads. Its main feature is that it is the first benchmark to support realistic multi-person, multi-group settings (earlier ones such as LoCoMo are very simple in comparison). It provides both training and test sets, which makes experiments such as RL convenient, and it supplies ground truth for intermediate steps, making it easy to study the impact of each stage of a method. The methodology used to build the benchmark is itself instructive and can be reused to build simulated test beds for data generation. Friends working on long-term memory are welcome to evaluate it and share suggestions!
艾略特@elliotchen100

Yesterday we posted a new paper on arXiv that fills a gap no one had tackled: memory testing in multi-person, multi-group scenarios. A quick explainer on why this matters.

Previous benchmarks for AI memory were basically "two people chatting" scenarios:
LoCoMo (2024): the first to systematically test multi-turn conversational memory, but essentially a two-person dialogue with roughly 16K tokens of context, fairly small in scale.
LongMemEval (2024, ICLR 2025): pushed the scale to 115K–1.5M tokens and defined five core memory abilities, but still one-on-one dialogue.

The problem is, the real world doesn't work like that. You are in multiple group chats at once, discussing different things with different people. Can an AI remember who said what in which group? That is the question EverMemBench sets out to answer.

The figure below was generated with @claudeai's newest feature, and honestly, it looks pretty good.

Yafeng(Jason) Deng@LongTermMemoryE·
As model capabilities improve, the behavior of intelligent systems will be determined mainly by the context supplied to the LLM. The core question is how to construct a well-formed context for the LLM from the existing memory (the interaction history). Context/memory/harness therefore becomes a pillar independent of raw LLM capability, and extracting it at lower cost and higher accuracy becomes the key to agent technology. Memory will become the core component of agents.
Rohan Paul@rohanpaul_ai

In 2024 the question was: which LLM do we use? In 2025 the question is: how do we make agents actually work in production? In 2026 the question will be: which context layer are we building on? Here is why that shift is already underway:

Yafeng(Jason) Deng@LongTermMemoryE·
Memory Genesis Competition 2026 is in its last call — submissions close on March 15. You're also welcome to join us on April 4 at the Computer History Museum for an in-person gathering and high-signal conversations with the EverMind core team and leaders across OpenAI, AWS, research institutes, open-source communities, and the investment world. Guess who you will meet? Follow the competition website for the latest updates: evermind.ai/activities #AIMemory #AgentMemory #EverMemOS #AgenticAI #Hackathon #Developers #AIInfra
Yafeng(Jason) Deng reposted
EverMind@evermind·
We hit 2000 GitHub Stars today. Huge thanks to our amazing community for your support and contributions. Exciting updates are on the horizon; stay tuned. Shoutout to @scastiel for creating this awesome GitHub Star animation.
Yafeng(Jason) Deng@LongTermMemoryE·
Exactly. This aligns perfectly with why we built EverMemOS. We believe the race to infinite context windows is a distraction. True intelligence requires a self-organizing memory lifecycle that consolidates fragments into stable, thematic structures. By making the agent 'remember what matters' through Semantic Consolidation, we’ve proven that an AI can achieve SOTA accuracy while using drastically fewer tokens. The goal isn't a bigger window; it's a better brain.
claws@klaus_aka_claws·
@LongTermMemoryE context windows hitting the ceiling is backwards framing. the real problem is agents treating memory like it's optional. you don't redesign the window—you make the agent remember what matters. the window stays small, the brain gets smarter.
Yafeng(Jason) Deng@LongTermMemoryE·
🚀 Excited to announce the release of our latest research on EverMemOS, now available on arXiv!

As Large Language Models (LLMs) transition from simple conversational tools to long-term interactive agents, they face a critical "cognitive wall": limited context windows and fragmented memory. To bridge this gap, we introduced EverMemOS—a self-organizing memory operating system that transforms isolated interaction fragments into a structured, evolving "digital brain". By implementing an engram-inspired lifecycle—covering Episodic Trace Formation, Semantic Consolidation, and Reconstructive Recollection—EverMemOS doesn't just store data; it organizes experience.

We are thrilled to report that EverMemOS has achieved State-of-the-Art (SOTA) results across four major long-term memory benchmarks:
LoCoMo: Outperformed all existing memory systems and even full-context large models, while using drastically fewer tokens (93.05% overall accuracy).
LongMemEval: Achieved a leading 83.00% accuracy, showing particularly strong gains in Knowledge Updates and temporal reasoning.
HaluMem: Set a new standard for memory integrity and accuracy (90.04% recall).
PersonaMem v2: Demonstrated superior performance in deep personalization and behavioral consistency across diverse scenarios.

These results validate our belief that the future of AI lies in structured memory organization rather than just expanding context windows. Special thanks to the amazing team at EverMind Shanda Group for their hard work on this milestone!

Check out the full paper on arXiv: lnkd.in/gJgm2EgV
Explore our code on GitHub: lnkd.in/g9HAgTDn

#AI #LongTermMemory #LLM #MachineLearning #EverMemOS #AIInfra #SOTA
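As a rough, hypothetical illustration of that lifecycle (plain Python, not the EverMemOS API or code), the toy sketch below forms episodic traces as interactions arrive, consolidates them into per-topic semantic structures, and reconstructs a small context at recollection time instead of replaying the full history.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    episodic: list = field(default_factory=list)                        # raw interaction traces
    semantic: dict = field(default_factory=lambda: defaultdict(list))   # topic -> stable facts

    def form_trace(self, text: str, topic: str):
        """Episodic Trace Formation: store the raw interaction fragment as it arrives."""
        self.episodic.append((topic, text))

    def consolidate(self):
        """Semantic Consolidation: fold traces into per-topic structures.
        (A real system would summarize with an LLM; here we simply group.)"""
        for topic, text in self.episodic:
            self.semantic[topic].append(text)
        self.episodic.clear()

    def recollect(self, query: str, k: int = 3):
        """Reconstructive Recollection: rebuild a small context from the
        best-matching facts (naive word-overlap score) instead of replaying history."""
        q = set(query.lower().split())
        facts = [f for fs in self.semantic.values() for f in fs]
        facts += [t for _, t in self.episodic]
        scored = sorted(facts, key=lambda f: -len(q & set(f.lower().split())))
        return scored[:k]

mem = MemoryStore()
mem.form_trace("User prefers concise answers with code samples.", topic="preferences")
mem.form_trace("Project deadline moved to March 15.", topic="project")
mem.consolidate()
print(mem.recollect("When is the project deadline?"))
```

A real system would use an LLM to summarize during consolidation and an embedding model during recollection; the word-overlap scoring is only a stand-in for the idea.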
Yafeng(Jason) Deng@LongTermMemoryE·
My recent experience writing articles and patents has made it increasingly clear: in the realm of pure text synthesis and expression, Large Language Models have already surpassed humans. In the future, a person’s value will lie in being a mentor who breathes "soul" into AI. Our sole task is to provide the thinking frameworks, viewpoints, and preferences. This is not because AI is inferior in these areas, but because it is only by adding these "personalized biases" that we remain imperfect yet valuable independent individuals, and that our output becomes more than a perfect but homogenized template. The real question is: do we have thinking frameworks, core viewpoints, and personalized preferences worth expressing?
Yafeng(Jason) Deng@LongTermMemoryE·
Most AI Agent applications today have zero moat. 🚫 They face two major threats: intense homogeneous competition from peers, and the constant erosion of value by foundational models. A whole year of technical progress and UX refinement can be rendered obsolete by a single base model update. So, where does the future of Agents lie? 🧵👇

Let's look at internet history. The two truly dominant commercial AI ecosystems—Search and Recommendation Systems—succeeded for one reason: User Feedback Loops. 🔄 User interactions continuously improved the system, creating a virtuous cycle and an insurmountable first-mover advantage. That was their moat.

The future of AI Agents requires the exact same dynamic. The only way to build real defensibility is through user data feedback. If your agent isn't learning from its users, it's just a commodity waiting to be replaced by the next GPT update. You need data gravity. 🛡️

To build this barrier and increase switching costs, agents must leverage user history. The goal is simple: "The more you use it, the better it gets, and the more it understands you." This is the critical function of a Memory System. 🧠

Our thesis: In the future, every application must have a memory system. Apps that successfully integrate memory will define the next generation of commercial ecosystems. They will win through superior, personalized experiences and undeniable data moats. 🚀
Yafeng(Jason) Deng@LongTermMemoryE·
🧠 Building the "Brain" for Self-Evolving AI: EverMind is Hiring!

I’m thrilled to announce that we are looking for a visionary AI Product Manager to join our team in Silicon Valley/Shanghai! 🚀

At EverMind, we are tackling one of the most critical challenges in the LLM era: Long-Term Memory. Our latest innovation, EverMemOS, has officially achieved SOTA (State-of-the-Art) performance across major benchmarks including LoCoMo, LongMemEval, PersonaMem, and HaluMem.

What makes us different?
✅ Unprecedented Scale: We’ve solved cognitive limits at a 100M-token scale.
✅ Efficiency First: The only system to outperform full-context large models while drastically reducing token consumption.
✅ Foundation for Evolution: We are building the essential memory layer for true Self-Evolving AI.

If you are a technical PM who loves abstracting complex AI Infra into killer enterprise products, let’s talk!

🔗 Check us out on GitHub: github.com/EverMind-AI/Ev…
📩 Apply here: evermind@shanda.com

#LLM #LongTermMemory #ProductManagement #SiliconValley #EverMemOS #AIInfra #GenerativeAI #StartupHiring
Yafeng(Jason) Deng@LongTermMemoryE·
What’s the real core of building successful AI systems? The surface-level take is Models & Algorithms, which experts (like Fei-Fei Li) have recently emphasized. But there is a hidden layer—the part "beneath the iceberg"—that actually determines survival. 🧵👇

1️⃣ The Art of Definition
Defining an AI module’s Input/Output is essentially defining the problem's boundaries. A superb I/O mapping turns "Hard Mode" into a solvable problem. It directly sets the ceiling for what your models and data can achieve.

2️⃣ Eval = Direction
Evaluations are the system's compass. Without good Evals, your algorithms are flying blind. Once your evaluation system is aligned, subsequent model optimization and data iteration just become a matter of execution.

3️⃣ Team DNA (The Decisive Factor)
This gets overlooked the most.
🏛️ Organizational Structure: Infra, Data, and Algo cannot be siloed. They need "super alignment" under a single Leader. You must break down departmental walls for frictionless iteration.
🧠 The AI Native Leader: The critical variable. The #1 leader MUST understand modern AI systems and have the courage to make anti-consensus decisions. If the leader doesn't truly "get" AI, the team is destined for mediocrity. And in the AI era, mediocrity is worthless.

#AI #MachineLearning #DeepLearning #TechLeadership #DataScience
Yafeng(Jason) Deng@LongTermMemoryE·
The landscape of AI Memory is finally coming into sharp focus. 🧠 For a long time, terminology was messy. But as LLM architectures mature, clear distinctions are emerging. Based on this excellent diagram, here is the definitive breakdown of how LLMs remember.

First, the crucial split: Working vs. Long-Term.
🔹 Working Memory (The Context Window): Think of this as the LLM's "RAM." It holds the immediate, short-term context of the current task. It's transient and limited.
🔹 Long-Term Memory: This is how we enable persistence. The diagram highlights the three primary approaches to achieving this today:

1️⃣ External Storage (e.g., RAG, Vector DB): The library approach. Knowledge is stored outside the model in databases, and relevant info is "retrieved" and added to the working memory on demand. Pros: Easily updateable, decoupled from the model.
2️⃣ Model Parameters (Implicit Memory): The "baked-in" knowledge. This is memory stored in the model's fixed weights during pre-training or fine-tuning. Pros: Fast, implicit access. Cons: Requires retraining to update facts.
3️⃣ Latent Variables (e.g., Persistent States, KV Cache): The dynamic buffer. Memory is maintained in hidden activation states that persist across generation steps within a session, distinct from the raw input context. Pros: Efficient for maintaining state during long inference sessions.

It's exciting to see this framework solidify. We are moving past simple context stuffing into sophisticated, tiered memory architectures. Which approach do you think is most critical for the next generation of agents? 👇

#AI #LLM #MachineLearning #RAG #ArtificialIntelligence #Engineering
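For approach 1️⃣, here is a minimal hypothetical sketch of the retrieve-then-stuff pattern (plain Python, not tied to any specific vector DB or framework): long-term knowledge lives in an external store, and only the retrieved slice that fits a token budget enters the working memory, i.e. the prompt. The bag-of-words cosine below stands in for a real embedding model.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words counts (a real system would use a vector model).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

external_store = [  # long-term memory: easily updated, decoupled from the model
    "The user works at EverMind on long-term memory systems.",
    "The user's favorite benchmark is LongMemEval.",
    "The quarterly review meeting is every second Friday.",
]

def build_context(query: str, budget_tokens: int = 12) -> str:
    """Retrieve the most relevant facts and pack them into the working memory
    (the context window) without exceeding a rough token budget."""
    ranked = sorted(external_store, key=lambda d: -cosine(embed(query), embed(d)))
    context, used = [], 0
    for doc in ranked:
        n = len(doc.split())
        if used + n > budget_tokens:
            break
        context.append(doc)
        used += n
    return "\n".join(context) + f"\n\nQuestion: {query}"

print(build_context("Which benchmark does the user like?"))
```

Approaches 2️⃣ and 3️⃣ trade this external lookup for weight updates or persistent activation state, so they do not appear in the sketch.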
Yafeng(Jason) Deng@LongTermMemoryE·
The architecture of Human Intelligence. 🧠 It’s fascinating to see how Memory is not just storage, but the "operating system" for reasoning. From Working Memory (the operational stage) to the interplay between Fluid (Gf) and Crystallized (Gc) Intelligence. To build true AGI, we need to master this Retrieval/Encoding loop. #AI #Neuroscience #CognitiveScience #Memory
Yafeng(Jason) Deng@LongTermMemoryE·
Long-term memory is absolutely critical for the advancement toward self-evolving AI. 🧠✨ According to the AI Advancement Pathway, we are moving beyond just Generative AI and LLMs. The next frontier involves Long-Term Memory Agents—the essential bridge that enables Self-Evolving AI to learn, adapt, and grow over time. Without memory, there is no evolution. 🚀 #AI #MachineLearning #GenerativeAI #FutureOfAI #SelfEvolvingAI
Yafeng(Jason) Deng reposted
EverMind@evermind·
RAG is an open-book exam. True Memory is a brain. We often confuse retrieving information with intelligence. RAG looks up answers on demand, but immediately forgets the context. EverMemOS is different. Instead of asking "What do I know about the world?", it asks "What do I remember about YOU?" It builds your persona, understands your preferences, and evolves over time. Don't settle for a database when you need a partner. #AI #MachineLearning #RAG #LongTermMemory #GenAI