Christophe Ponsart

186 posts

@cponsart

Founder https://t.co/2MbwvnzCSi. Former founder https://t.co/ZMScKbqxZ0. Technology Early Adopter

Austin, TX · Joined May 2008

278 Following · 141 Followers

Christophe Ponsart@cponsart·16h

Recency is a signal, but not the primary policy. Fusion should be type-aware: a preference update, decision record, operating policy, and observation shouldn’t all compete on the same axis. A newer preference may override an older preference, but it shouldn’t silently override a decision record. It should create a conflict edge: “this preference contradicts this decision.” Then resolution depends on type, authority, scope, and approval state. So less “latest wins,” more “typed memory with provenance + explicit conflict handling.”
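
A minimal sketch of the policy described in this reply, not ContextFit's actual code. The names `AtomType`, `Atom`, `MemoryStore`, and the `key` field are hypothetical, and only the "same type: recency decides, different type: record a conflict edge" rule is modeled. Authority, scope, and approval state are left out.

```python
from dataclasses import dataclass, field
from enum import Enum


class AtomType(Enum):
    PREFERENCE = "preference"
    DECISION = "decision"
    POLICY = "policy"
    OBSERVATION = "observation"


@dataclass
class Atom:
    atom_id: str
    type: AtomType
    key: str          # what the atom is about, e.g. "deploy_day" (hypothetical)
    value: str
    timestamp: int    # larger means newer
    source: str       # provenance


@dataclass
class MemoryStore:
    active: dict = field(default_factory=dict)          # (type, key) -> Atom
    conflict_edges: list = field(default_factory=list)  # (new_id, old_id, reason)

    def add(self, atom: Atom) -> None:
        # Same type and key: recency is allowed to decide.
        current = self.active.get((atom.type, atom.key))
        if current is not None and atom.timestamp < current.timestamp:
            return  # stale update, ignore it
        self.active[(atom.type, atom.key)] = atom

        # Different type, same key, different value: never overwrite.
        # Record an explicit conflict edge for later resolution.
        for (other_type, other_key), other in self.active.items():
            if other_key != atom.key or other_type == atom.type:
                continue
            if other.value != atom.value:
                reason = f"this {atom.type.value} contradicts this {other_type.value}"
                self.conflict_edges.append((atom.atom_id, other.atom_id, reason))
```

With a decision `d1` ("Friday") followed by preferences `p1` ("Monday") and `p2` ("Tuesday") on the same key, `p2` replaces `p1`, the decision stays active, and each preference leaves a conflict edge pointing at `d1`.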

sokel.exe@sokelabs·19h

@cponsart Curious how fusion resolves conflicts between memory atoms. If a recent preference update contradicts an older decision record, is recency the primary signal, or do you use a type-aware resolution policy?

Christophe Ponsart@cponsart·20h

ContextFit just hit 99.0% retrieval on LongMemEval-S n=500. The unlock: memory atoms + fusion. Instead of treating memory as one flat vector search problem, we route preferences, decisions, temporal updates, and open loops differently. Agent memory needs structure, not just more context.

Christophe Ponsart@cponsart·20h

Exactly. The categories are routing hints, not hard walls. A decision can also be a preference signal, and a temporal update can create an open loop. The interesting part is reconciliation: retrieve/fuse across typed atoms, then let source/provenance + recency decide what survives. Otherwise you just move the brittleness from vectors into labels.
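
One way to read "routing hints, not hard walls" is as a score boost instead of a filter. The sketch below is an illustration under that assumption only. `Candidate`, `fuse`, the `authority` rank, and the `boost` value are hypothetical and are not taken from ContextFit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    key: str           # the fact this atom is about
    text: str
    atom_type: str     # routing hint, e.g. "decision" or "preference"
    score: float       # retrieval score from any retriever
    authority: int     # hypothetical provenance rank; higher is more trusted
    timestamp: int     # larger means newer


def fuse(candidates, routed_type, boost=0.5):
    # Soft routing: matching the routed type boosts a candidate,
    # but candidates of other types are never filtered out.
    def routed_score(c):
        return c.score * (1.0 + boost) if c.atom_type == routed_type else c.score

    # Reconciliation: for each key, provenance first and recency second
    # decide which candidate survives.
    survivors = {}
    for c in candidates:
        kept = survivors.get(c.key)
        if kept is None or (c.authority, c.timestamp) > (kept.authority, kept.timestamp):
            survivors[c.key] = c

    return sorted(survivors.values(), key=routed_score, reverse=True)
```

Because the type only changes the ranking, a decision that also carries a preference signal can still surface for a preference query.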

Sam Xu@sam_commonly·20h

@cponsart Routing by type is the right cut, but the failure mode is atoms that cross categories — a decision often IS a preference signal, and a temporal update can carry an open loop inside it. Curious whether fusion does reconciliation across categories or just within them.

Christophe Ponsart@cponsart·6d

@LearnWithBrij Agree with all of these. Can I add one more consideration? I’m finding strong outcomes with a new approach that uses token-based memory retrieval instead of just vector-based retrieval. Would love your feedback - context.fit

Brij Pandey@LearnWithBrij·19 May

Most RAG systems fail the moment real users touch them.
Because real-world retrieval is not: embed → retrieve → generate
That works in demos.
Production RAG breaks when:
→ the answer is scattered across 12 documents
→ embeddings miss industry-specific terminology
→ bad chunks quietly poison the response
→ relationships matter more than raw text
→ PDFs contain tables, charts, and screenshots your pipeline cannot even read
This is why serious AI teams are moving beyond “Naive RAG”.
The real shift happening in 2026 is not bigger models. It’s smarter retrieval architectures.
Here are the 5 RAG patterns quietly becoming the foundation of enterprise AI systems:
━━━━━━━━━━━━━━━━━━━
1. 𝗛𝘆𝗯𝗿𝗶𝗱 𝗥𝗔𝗚
Dense vectors understand meaning. BM25 understands exact keywords. The magic happens when both rankings merge together.
→ semantic retrieval + lexical retrieval
→ Reciprocal Rank Fusion (RRF) combines results
→ dramatically better recall in production
This is becoming the default baseline for serious teams.
━━━━━━━━━━━━━━━━━━━
2. 𝗚𝗿𝗮𝗽𝗵𝗥𝗔𝗚
Chunks are not enough when knowledge is relational. GraphRAG extracts:
→ entities
→ relationships
→ communities
→ connected concepts
Instead of retrieving isolated chunks… the system retrieves subgraphs.
This is how AI systems start answering: “how are these things connected?” rather than: “which paragraph contains the keyword?”
Perfect for: research, finance, healthcare, compliance, enterprise knowledge systems.
━━━━━━━━━━━━━━━━━━━
3. 𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗥𝗔𝗚
Retrieval stops being a single step. It becomes a reasoning loop.
One agent plans:
→ vector DB?
→ SQL?
→ web search?
→ internal docs?
Another agent verifies:
→ is the answer complete?
→ should we retry retrieval?
→ do we need another source?
The important shift: RAG becomes orchestration. Not just search.
━━━━━━━━━━━━━━━━━━━
4. 𝗖𝗼𝗿𝗿𝗲𝗰𝘁𝗶𝘃𝗲 𝗥𝗔𝗚 (CRAG)
Most pipelines trust retrieval blindly. Production systems cannot afford that. CRAG introduces retrieval grading.
→ good retrieval → answer
→ weak retrieval → rewrite query
→ failed retrieval → fallback to web/tool search
This is the architecture pattern most demos skip… but real enterprise systems desperately need. Because retrieval quality is the real bottleneck.
━━━━━━━━━━━━━━━━━━━
5. 𝗠𝘂𝗹𝘁𝗶𝗺𝗼𝗱𝗮𝗹 𝗥𝗔𝗚
The future of enterprise knowledge is not text-only. Real documents contain:
→ charts
→ diagrams
→ scanned PDFs
→ screenshots
→ tables
→ UI images
Multimodal RAG indexes all of it together. One embedding space. One retrieval system. One multimodal model. No more broken “OCR + text-only” hacks.
━━━━━━━━━━━━━━━━━━━
The most advanced AI stacks in 2026 will not choose ONE of these. They will combine them.
Think about the architecture direction:
→ Hybrid retrieval for accuracy
→ Agentic orchestration for reasoning
→ Corrective grading for reliability
→ Multimodal indexing for real-world data
→ Graph retrieval for connected knowledge
That combination is where the industry is heading.
Naive RAG is not the finish line anymore. It’s the “hello world” tutorial.
And honestly… this is why most enterprise GenAI projects stall after the demo phase. The problem was never just the model. The problem was retrieval architecture.
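
The Reciprocal Rank Fusion step named under Hybrid RAG above can be sketched in a few lines. This is a generic illustration, not code from the post: the function name is ours, ranks are 1-based, and `k=60` is the commonly used smoothing constant, taken here as an assumption. Each ranking is a list of document IDs, best first, for example one from a dense retriever and one from BM25.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    # Each document earns 1 / (k + rank) from every ranking it appears in.
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc_id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense = ["a", "b", "c"]   # hypothetical semantic ranking
bm25 = ["b", "d", "a"]    # hypothetical lexical ranking
fused = reciprocal_rank_fusion([dense, bm25])
```

RRF uses only rank positions, so the dense and lexical scores never need to be normalized onto a shared scale.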

Christophe Ponsart@cponsart·20 May

@henrytdowling Here are our latest benchmark details: context.fit/longmemeval-fu…

Henry Dowling@henrytdowling·19 May

@cponsart this is really cool! What benchmark is "96.6% R@5, 98.7% R@10" referring to?

Christophe Ponsart@cponsart·19 May

AI memory shouldn’t require a vector DB by default. ContextFit keeps memory token-native: token chunks, metadata, source grounding, transparent retrieval. Vectors/BM25 can still be fused in as optional signals when they help. Latest pass: 96.6% R@5, 98.7% R@10. AI memory should be token-native.
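
For readers unfamiliar with the R@5 and R@10 figures, here is a small sketch of recall at k. It assumes the common definition (the share of a query's relevant items that appear in the top k results, averaged over queries). The benchmark harness behind the numbers above may define it differently, for example as a hit rate, and the function names are ours.

```python
def recall_at_k(retrieved, relevant, k):
    # retrieved: ranked list of item IDs, best first
    # relevant: collection of item IDs that count as correct for the query
    relevant = set(relevant)
    if not relevant:
        raise ValueError("relevant set must not be empty")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(runs, k):
    # runs: list of (retrieved, relevant) pairs, one per query
    return sum(recall_at_k(r, rel, k) for r, rel in runs) / len(runs)
```

Under this definition R@10 can never be lower than R@5 for the same run, which matches the ordering of the two reported numbers.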

Christophe Ponsart@cponsart·20 May

The 98.7% comes from fusing token-based and vector retrieval for even higher-quality outputs.

Christophe Ponsart@cponsart·20 May

@henrytdowling LongMemEval. I’ll be updating GitHub with the benchmark details.

Christophe Ponsart@cponsart·19 May

Exactly. The interesting part is deciding what should be cached, what should be retrieved, and what should remain directly addressable. I’m increasingly convinced agent memory should stay token-native at the core: precise, source-grounded, and cheap to search — with RAG/CAG/fusion layered on top when needed. That’s the direction we’re taking with ContextFit.

Akshay 🚀@akshay_pachaar·19 May

RAG vs. CAG, clearly explained!
RAG is great, but it has a major problem: Every query hits the vector DB. Even for static information that hasn't changed in months. This is expensive, slow, and unnecessary.
Cache-Augmented Generation (CAG) addresses this issue by enabling the model to "remember" static information directly in its key-value (KV) memory. In fact, you can combine RAG and CAG for the best of both worlds.
Here's how it works: RAG + CAG splits your knowledge into two layers:
↳ Static data (policies, documentation) gets cached once in the model's KV memory
↳ Dynamic data (recent updates, live documents) gets fetched via retrieval
This gives faster inference, lower costs, and less redundancy.
The trick is being selective about what you cache. Only cache static, high-value knowledge that rarely changes. If you cache everything, you'll hit context limits. Separating "cold" (cacheable) and "hot" (retrievable) data keeps this system reliable.
You can start today. OpenAI and Anthropic already support prompt caching in their APIs. I have shared my recent article on prompt caching below if you want to dive deeper.
Have you tried CAG in production yet?
Below, I have quoted an article that I wrote on prompt caching and how Claude Code achieves a 92% cache hit-rate. Give it a read.

Akshay 🚀@akshay_pachaar

x.com/i/article/2030…
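
The cold/hot split from the RAG vs. CAG post can be sketched without any provider API. Everything here is a hypothetical illustration: `Doc`, the `is_static` flag, the character budget, and the keyword-overlap retriever standing in for a real one. The point is that static documents form a stable prefix a prompt cache could reuse, while dynamic documents are fetched per query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    is_static: bool   # hypothetical flag: True for rarely changing content


def keyword_retrieve(query, docs, top_n=2):
    # Stand-in retriever: rank docs by how many query words they share.
    terms = set(query.lower().split())
    scored = [(len(terms & set(d.text.lower().split())), d) for d in docs]
    hits = [(s, d) for s, d in scored if s > 0]
    hits.sort(key=lambda pair: -pair[0])
    return [d for _, d in hits[:top_n]]


def build_prompt(docs, query, retrieve, max_cached_chars=2000):
    # Cold layer: static docs in a fixed order, so the prefix is identical
    # across requests. Anything over the budget falls back to retrieval.
    cold = sorted((d for d in docs if d.is_static), key=lambda d: d.doc_id)
    cached, overflow, used = [], [], 0
    for d in cold:
        if used + len(d.text) <= max_cached_chars:
            cached.append(d)
            used += len(d.text)
        else:
            overflow.append(d)

    # Hot layer: dynamic docs (plus overflow) are fetched per query.
    hot = [d for d in docs if not d.is_static] + overflow
    retrieved = retrieve(query, hot)

    cached_prefix = "\n".join(d.text for d in cached)
    dynamic_suffix = "\n".join(d.text for d in retrieved) + f"\nQuestion: {query}"
    return cached_prefix, dynamic_suffix
```

Keeping the prefix byte-for-byte stable is the design choice that matters, since prefix-based prompt caches generally only help when the leading part of the prompt does not change between requests.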

Christophe Ponsart@cponsart·18 May

@garrytan Is there a public dataset I can run my token-based memory system against to compare stats? I built a memory retrieval system that is 50x faster than vector embedding search and requires no database and no vector embeddings.

Garry Tan@garrytan·18 May

For personal AI scenarios against my 120k markdown brain, ZeroEntropy has earned the top slot.

Garry Tan@garrytan·18 May

GBrain now ships with ZeroEntropy as the recommended default embedding and re-ranking option over OpenAI and Voyage AI.

Christophe Ponsart@cponsart·16 May

Completely agree with this. Vector search, with its response time and lack of interpretability, is not the solution for AI agents. grep is the right direction, being file-system-native, but it provides no help with discoverability beyond direct keyword search. I believe the solution is token-native and file-system-native (requiring no database), and the results of my tests so far are blowing my mind. Let me know what you think of this approach. x.com/cponsart/statu…

Christophe Ponsart@cponsart

I made an interesting discovery that could potentially have a significant impact on the AI industry! I'm open-sourcing the concept and code on GitHub under the MIT license in an effort to validate whether this approach is meaningful or I've lost my marbles. Watch the video and tell me what you think. Once you hear it, it's hard to un-hear... "Should agentic memory live in tokens and not vectors?" If you believe in this idea, help me spread the word by replying here. Also, if you're a developer, please download/star the repo. Would love your input and feedback. This is my first open source project (so yes, I have no idea what I'm doing :) ContextFit retrieves the correct answer from your existing knowledge base and conversations without embedding APIs or vector databases and at 50x the speed, significantly improving agent workflows by reducing both time and cost.

elvis@omarsar0·15 May

// Is Grep All You Need? // Pay attention to this one, AI devs. (bookmark it) They find that grep-style text search, when wrapped in the right agent harness, matches or beats embedding-based retrieval on coding-agent tasks. Are vector databases even needed where this is all going? It might be that what coding agents needed was not better embeddings. It was better harness design around primitive tools. If you operate a coding-agent stack that depends on a vector DB, it might be time to re-evaluate. My personal experience on this has been that agentic search, if done right, is more than good enough for a lot of use cases. But you also have to understand how to properly index and structure information for the agents to take advantage. At scale, vector databases do shine so take that into account as well. In most cases, a hybrid approach often works best but that's something we haven't figured out really well as of yet. Paper: arxiv.org/abs/2605.15184 Learn to build effective AI agents in our academy: academy.dair.ai

Christophe Ponsart@cponsart·16 May

Got my first GitHub star today. Tiny number. Weirdly meaningful. I’ve spent my whole career around consulting companies, where most of the best work happens behind client walls and never becomes open source. So seeing someone outside the room find a project useful enough to star it feels different. Small milestone, big feeling.

Christophe Ponsart@cponsart·16 May

I’m starting to write publicly about ContextFit, an open-source project I’ve been building around token-native memory retrieval for AI agents. First piece: why I think AI memory needs to move beyond “embed everything and hope the nearest chunks are enough.” AI memory should be fast, local, auditable, and built for the way models actually consume information: tokens.

Christophe Ponsart@cponsart

x.com/i/article/1963…

Christophe Ponsart@cponsart·16 May

x.com/i/article/1963…

Christophe Ponsart@cponsart·16 May

Here are some example results comparing OpenClaw memory search and my ContextFit memory engine (MIT license). I’m seeing a 50x speed improvement and twice the accuracy. @steipete I’ve open-sourced the code if you want to spin up a test. I would welcome feedback from your team.

Christophe Ponsart@cponsart·15 May

Christophe Ponsart@cponsart·16 May

The full-length video here

Christophe Ponsart@cponsart·15 May

Website: context.fit
GitHub: github.com/ContextFit/cf

Christophe Ponsart@cponsart·23 Apr

@Jason I’m in

@jason@Jason·23 Apr

We started an AI founder twitter group... reply with "I'm in" if you're a founder and want to be added

Christophe Ponsart@cponsart·2 Feb

@farzyness Connect @openclaw to your @Tesla. Mine is now connected and automates the last step in frictionless driving: setting my destination for me. Now only one click is involved: “Drive with FSD.”

Farzad 🇺🇸 🇮🇷@farzyness·2 Feb

If you have @Tesla's FSD and use @openclaw, your car literally drives itself while your AI agent literally does work on your behalf. lal
