Christophe Ponsart

186 posts

@cponsart

Founder https://t.co/2MbwvnzCSi. Former founder https://t.co/ZMScKbqxZ0. Technology Early Adopter

Austin, TX Joined May 2008
278 Following 141 Followers
Christophe Ponsart
Christophe Ponsart@cponsart·
Recency is a signal, but not the primary policy. Fusion should be type-aware: a preference update, decision record, operating policy, and observation shouldn’t all compete on the same axis. A newer preference may override an older preference, but it shouldn’t silently override a decision record. It should create a conflict edge: “this preference contradicts this decision.” Then resolution depends on type, authority, scope, and approval state. So less “latest wins,” more “typed memory with provenance + explicit conflict handling.”
1
0
0
9
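A minimal sketch of the policy described in the post above, under stated assumptions: the `Atom` and `MemoryStore` names, the fields, and the "same topic, different value" test for a contradiction are all hypothetical and are not taken from ContextFit. It only shows the two behaviours the post names: a newer atom may replace an older atom of the same type, while a contradiction across types is recorded as a conflict edge instead of being overwritten.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Atom:
    atom_id: str
    kind: str       # e.g. "preference", "decision", "policy", "observation"
    topic: str
    value: str
    timestamp: int


@dataclass
class MemoryStore:
    current: dict = field(default_factory=dict)    # latest atom per (kind, topic)
    conflicts: list = field(default_factory=list)  # (new_id, other_id, reason)

    def add(self, atom: Atom) -> None:
        key = (atom.kind, atom.topic)
        existing = self.current.get(key)
        if existing is not None and existing.timestamp > atom.timestamp:
            return  # stale write: keep the newer atom of the same type
        # Same type, same topic: the newer atom overrides the older one.
        self.current[key] = atom
        # Different type, same topic, different value: do not override;
        # record a conflict edge and keep both atoms.
        for (kind, topic), other in self.current.items():
            if topic == atom.topic and kind != atom.kind and other.value != atom.value:
                reason = f"this {atom.kind} contradicts this {other.kind}"
                self.conflicts.append((atom.atom_id, other.atom_id, reason))
```

Resolving a recorded conflict (by type, authority, scope, and approval state, as the post puts it) is deliberately left out; the sketch stops at making the conflict explicit.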
sokel.exe
sokel.exe@sokelabs·
@cponsart Curious how fusion resolves conflicts between memory atoms. If a recent preference update contradicts an older decision record, is recency the primary signal, or do you use a type-aware resolution policy?
1
0
0
12
Christophe Ponsart
Christophe Ponsart@cponsart·
ContextFit just hit 99.0% retrieval on LongMemEval-S n=500. The unlock: memory atoms + fusion. Instead of treating memory as one flat vector search problem, we route preferences, decisions, temporal updates, and open loops differently. Agent memory needs structure, not just more context.
2
0
0
24
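To illustrate what "routing by type" could look like, here is a small sketch. Everything in it is an assumption made for illustration: the cue words in `ROUTES`, the `route` and `retrieve` helpers, and the word-overlap scoring are hypothetical and say nothing about how ContextFit actually routes or scores.

```python
import re

# Hypothetical cue words per atom type; a real router would likely be richer.
ROUTES = {
    "preference": {"prefer", "prefers", "favorite", "likes"},
    "decision": {"decide", "decided", "chose", "agreed"},
    "temporal_update": {"latest", "current", "currently", "changed"},
    "open_loop": {"pending", "todo", "unresolved", "waiting"},
}


def tokenize(text):
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def route(query):
    """Return the atom types a query should be searched against."""
    words = tokenize(query)
    kinds = [kind for kind, cues in ROUTES.items() if words & cues]
    return kinds or list(ROUTES)  # no cue matched: search every type


def retrieve(query, atoms):
    """atoms is a list of (kind, text) pairs; rank routed atoms by word overlap."""
    kinds = set(route(query))
    words = tokenize(query)
    scored = [(len(words & tokenize(text)), text) for kind, text in atoms if kind in kinds]
    return [text for score, text in sorted(scored, key=lambda s: -s[0]) if score > 0]
```

The fallback to every type when no cue matches keeps the categories as routing hints: a query that cannot be classified still gets searched everywhere.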
Christophe Ponsart
Christophe Ponsart@cponsart·
Exactly. The categories are routing hints, not hard walls. A decision can also be a preference signal, and a temporal update can create an open loop. The interesting part is reconciliation: retrieve/fuse across typed atoms, then let source/provenance + recency decide what survives. Otherwise you just move the brittleness from vectors into labels.
0
0
0
8
Sam Xu
Sam Xu@sam_commonly·
@cponsart Routing by type is the right cut, but the failure mode is atoms that cross categories — a decision often IS a preference signal, and a temporal update can carry an open loop inside it. Curious whether fusion does reconciliation across categories or just within them.
1
0
0
16
Christophe Ponsart
Christophe Ponsart@cponsart·
@LearnWithBrij Agree with all of these. Can I add one more consideration? I’m finding strong outcomes with a new approach that uses token-based memory retrieval instead of just vector-based retrieval. Would love your feedback: context.fit
0
0
0
53
Brij Pandey
Brij Pandey@LearnWithBrij·
Most RAG systems fail the moment real users touch them.
Because real-world retrieval is not:
embed → retrieve → generate
That works in demos.
Production RAG breaks when:
→ the answer is scattered across 12 documents
→ embeddings miss industry-specific terminology
→ bad chunks quietly poison the response
→ relationships matter more than raw text
→ PDFs contain tables, charts, and screenshots your pipeline cannot even read
This is why serious AI teams are moving beyond “Naive RAG”.
The real shift happening in 2026 is not bigger models. It’s smarter retrieval architectures.
Here are the 5 RAG patterns quietly becoming the foundation of enterprise AI systems:
━━━━━━━━━━━━━━━━━━━
1. 𝗛𝘆𝗯𝗿𝗶𝗱 𝗥𝗔𝗚
Dense vectors understand meaning. BM25 understands exact keywords. The magic happens when both rankings merge together.
→ semantic retrieval + lexical retrieval
→ Reciprocal Rank Fusion (RRF) combines results
→ dramatically better recall in production
This is becoming the default baseline for serious teams.
━━━━━━━━━━━━━━━━━━━
2. 𝗚𝗿𝗮𝗽𝗵𝗥𝗔𝗚
Chunks are not enough when knowledge is relational. GraphRAG extracts:
→ entities
→ relationships
→ communities
→ connected concepts
Instead of retrieving isolated chunks… the system retrieves subgraphs.
This is how AI systems start answering: “how are these things connected?” rather than: “which paragraph contains the keyword?”
Perfect for: research, finance, healthcare, compliance, enterprise knowledge systems.
━━━━━━━━━━━━━━━━━━━
3. 𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗥𝗔𝗚
Retrieval stops being a single step. It becomes a reasoning loop.
One agent plans:
→ vector DB?
→ SQL?
→ web search?
→ internal docs?
Another agent verifies:
→ is the answer complete?
→ should we retry retrieval?
→ do we need another source?
The important shift: RAG becomes orchestration. Not just search.
━━━━━━━━━━━━━━━━━━━
4. 𝗖𝗼𝗿𝗿𝗲𝗰𝘁𝗶𝘃𝗲 𝗥𝗔𝗚 (CRAG)
Most pipelines trust retrieval blindly. Production systems cannot afford that. CRAG introduces retrieval grading.
→ good retrieval → answer
→ weak retrieval → rewrite query
→ failed retrieval → fallback to web/tool search
This is the architecture pattern most demos skip… but real enterprise systems desperately need. Because retrieval quality is the real bottleneck.
━━━━━━━━━━━━━━━━━━━
5. 𝗠𝘂𝗹𝘁𝗶𝗺𝗼𝗱𝗮𝗹 𝗥𝗔𝗚
The future of enterprise knowledge is not text-only. Real documents contain:
→ charts
→ diagrams
→ scanned PDFs
→ screenshots
→ tables
→ UI images
Multimodal RAG indexes all of it together. One embedding space. One retrieval system. One multimodal model. No more broken “OCR + text-only” hacks.
━━━━━━━━━━━━━━━━━━━
The most advanced AI stacks in 2026 will not choose ONE of these. They will combine them.
Think about the architecture direction:
→ Hybrid retrieval for accuracy
→ Agentic orchestration for reasoning
→ Corrective grading for reliability
→ Multimodal indexing for real-world data
→ Graph retrieval for connected knowledge
That combination is where the industry is heading.
Naive RAG is not the finish line anymore. It’s the “hello world” tutorial.
And honestly… this is why most enterprise GenAI projects stall after the demo phase. The problem was never just the model. The problem was retrieval architecture.
19
89
432
14.5K
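The Hybrid RAG pattern in the post above merges a dense ranking and a BM25 ranking with Reciprocal Rank Fusion. A minimal sketch of RRF follows: each document scores the sum of 1 / (k + rank) over the rankings it appears in, with ranks starting at 1. The function name, the alphabetical tie-break, and the example document ids are illustrative assumptions; k = 60 is the constant commonly used with RRF.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense = ["a", "b", "c"]   # hypothetical ranking from vector search
bm25 = ["b", "d", "a"]    # hypothetical ranking from keyword search
fused = reciprocal_rank_fusion([dense, bm25])
```

Because RRF uses only ranks, it needs no score normalisation between the two retrievers, which is one reason it is a convenient default for hybrid setups.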
Henry Dowling
Henry Dowling@henrytdowling·
@cponsart this is really cool! What benchmark is "96.6% R@5, 98.7% R@10" referring to?
2
0
0
29
Christophe Ponsart
Christophe Ponsart@cponsart·
AI memory shouldn’t require a vector DB by default. ContextFit keeps memory token-native: token chunks, metadata, source grounding, transparent retrieval. Vectors/BM25 can still be fused in as optional signals when they help. Latest pass: 96.6% R@5, 98.7% R@10. AI memory should be token-native.
2
0
1
42
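For readers unfamiliar with the R@5 and R@10 figures quoted above, here is a sketch of one common definition of recall@k: the share of a query's relevant items that appear in the top k results, averaged over queries. The post does not say which exact definition or benchmark harness produced its numbers, so treat this as an illustration of the metric only; the function names are hypothetical.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    relevant_set = set(relevant)
    if not relevant_set:
        raise ValueError("recall is undefined when nothing is relevant")
    return len(set(retrieved[:k]) & relevant_set) / len(relevant_set)


def mean_recall_at_k(runs, k):
    """Average recall@k over (retrieved, relevant) pairs, one per query."""
    return sum(recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs) / len(runs)
```

Some benchmarks instead count a query as a hit when any relevant item lands in the top k; the two definitions agree when each query has exactly one relevant item.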
Christophe Ponsart
Christophe Ponsart@cponsart·
The 98.7% comes from fusing token-based and vector retrieval for even higher-quality output.
0
0
1
27
Christophe Ponsart
Christophe Ponsart@cponsart·
Exactly. The interesting part is deciding what should be cached, what should be retrieved, and what should remain directly addressable. I’m increasingly convinced agent memory should stay token-native at the core: precise, source-grounded, and cheap to search — with RAG/CAG/fusion layered on top when needed. That’s the direction we’re taking with ContextFit.
0
0
0
446
Akshay 🚀
Akshay 🚀@akshay_pachaar·
RAG vs. CAG, clearly explained!
RAG is great, but it has a major problem: every query hits the vector DB, even for static information that hasn't changed in months. This is expensive, slow, and unnecessary.
Cache-Augmented Generation (CAG) addresses this issue by enabling the model to "remember" static information directly in its key-value (KV) memory. In fact, you can combine RAG and CAG for the best of both worlds.
Here's how it works: RAG + CAG splits your knowledge into two layers:
↳ Static data (policies, documentation) gets cached once in the model's KV memory
↳ Dynamic data (recent updates, live documents) gets fetched via retrieval
This gives faster inference, lower costs, and less redundancy.
The trick is being selective about what you cache. Only cache static, high-value knowledge that rarely changes. If you cache everything, you'll hit context limits. Separating "cold" (cacheable) and "hot" (retrievable) data keeps this system reliable.
You can start today. OpenAI and Anthropic already support prompt caching in their APIs. I have shared my recent article on prompt caching below if you want to dive deeper.
Have you tried CAG in production yet?
Below, I have quoted an article that I wrote on prompt caching and how Claude Code achieves a 92% cache hit-rate. Give it a read.
Akshay 🚀@akshay_pachaar

x.com/i/article/2030…

51
302
1.5K
274.7K
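The cold/hot split described in the post above can be sketched as follows. This is an illustration under assumptions, not any provider's API: the `Doc` fields, the token budget, and the helper names are hypothetical, and token counts are supplied by the caller. The idea shown is only that static documents that fit the budget go into a stable prompt prefix (the part a cache can reuse across requests), while everything else stays on the retrieval path.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    text: str
    is_static: bool   # rarely changes (policies, documentation)
    tokens: int       # rough size estimate supplied by the caller


def split_cold_hot(docs, cache_budget_tokens):
    """Cold docs go into the cached prefix; hot docs are fetched per query."""
    cold, hot, used = [], [], 0
    for doc in docs:
        if doc.is_static and used + doc.tokens <= cache_budget_tokens:
            cold.append(doc)
            used += doc.tokens
        else:
            hot.append(doc)  # dynamic, or static but over the cache budget
    return cold, hot


def build_prompt(cold, retrieved, question):
    """Return (stable_prefix, per_request_suffix)."""
    prefix = "\n\n".join(doc.text for doc in cold)
    context = "\n\n".join(doc.text for doc in retrieved)
    return prefix, f"{context}\n\nQuestion: {question}"
```

Keeping the prefix byte-for-byte identical across requests matters because prompt caches generally match on a shared prefix; reordering the cold documents would defeat the cache.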
Christophe Ponsart
Christophe Ponsart@cponsart·
@garrytan Is there a public dataset I can run my token-based memory system against to compare stats? I built a memory retrieval system that is 50x faster than vector embedding search and requires no database and no vector embeddings.
0
0
1
138
Garry Tan
Garry Tan@garrytan·
For personal AI scenarios against my 120k markdown brain ZeroEntropy has earned the top slot
5
0
43
11K
Garry Tan
Garry Tan@garrytan·
GBrain now ships with ZeroEntropy as the recommended default embedding and re-ranking option over OpenAI and Voyage AI.
25
34
478
102.8K
elvis
elvis@omarsar0·
// Is Grep All You Need? // Pay attention to this one, AI devs. (bookmark it) They find that grep-style text search, when wrapped in the right agent harness, matches or beats embedding-based retrieval on coding-agent tasks. Are vector databases even needed where this is all going? It might be that what coding agents needed was not better embeddings. It was better harness design around primitive tools. If you operate a coding-agent stack that depends on a vector DB, it might be time to re-evaluate. My personal experience on this has been that agentic search, if done right, is more than good enough for a lot of use cases. But you also have to understand how to properly index and structure information for the agents to take advantage. At scale, vector databases do shine, so take that into account as well. In most cases, a hybrid approach works best, but that's something we haven't figured out really well yet. Paper: arxiv.org/abs/2605.15184 Learn to build effective AI agents in our academy: academy.dair.ai
45
97
639
79.2K
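As a concrete picture of the "grep-style text search" the post above refers to, here is a minimal sketch of a search tool an agent harness might expose. It is not the method from the linked paper; the `grep` function, its return shape, and the in-memory `files` mapping (path to text) are assumptions made to keep the example self-contained.

```python
import re


def grep(pattern, files, ignore_case=True):
    """Return (path, line_number, line) for every line matching the regex."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    hits = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path, lineno, line.strip()))
    return hits


files = {
    "a.py": "import os\ndef load_config():\n    pass",
    "b.md": "Call load_config first",
}
matches = grep(r"load_config", files)
```

Returning the path and line number with each hit gives the agent something directly addressable to open next, which is part of what a harness around a primitive tool has to provide.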
Christophe Ponsart
Christophe Ponsart@cponsart·
Got my first GitHub star today. Tiny number. Weirdly meaningful. I’ve spent my whole career around consulting companies, where most of the best work happens behind client walls and never becomes open source. So seeing someone outside the room find a project useful enough to star it feels different. Small milestone, big feeling.
0
0
3
110
Christophe Ponsart
Christophe Ponsart@cponsart·
I’m starting to write publicly about ContextFit, an open-source project I’ve been building around token-native memory retrieval for AI agents. First piece: why I think AI memory needs to move beyond “embed everything and hope the nearest chunks are enough.” AI memory should be fast, local, auditable, and built for the way models actually consume information: tokens.
Christophe Ponsart@cponsart

x.com/i/article/1963…

0
0
0
121
Christophe Ponsart
Christophe Ponsart@cponsart·
Here are some example results comparing OpenClaw memory search and my ContextFit memory engine (MIT license). I’m seeing a 50x speed improvement and twice the accuracy. @steipete I’ve open sourced the code if you want to spin up a test. I would welcome feedback from your team.
0
0
1
130
Christophe Ponsart
Christophe Ponsart@cponsart·
I made an interesting discovery that could have a significant impact on the AI industry! I'm open sourcing the concept and code on GitHub under the MIT license to validate whether this approach is meaningful or whether I've lost my marbles. Watch the video and tell me what you think. Once you hear it, it's hard to un-hear... "Should agentic memory live in tokens and not vectors?" If you believe in this idea, help me spread the word by replying here. Also, if you're a developer, please download/star the repo. Would love your input and feedback. This is my first open source project (so yes, I have no idea what I'm doing :) ). ContextFit retrieves the correct answer from your existing knowledge base and conversations without embedding APIs or vector databases, and at 50x the speed, significantly improving agent workflows by reducing both time and cost.
3
1
20
129.3K
@jason
@jason@Jason·
We started an AI founder twitter group... reply with "I'm in" if you're a founder and want to be added
10.8K
134
4.6K
904.5K