Robert Youssef @rryssf
Your AI has been quietly forgetting everything you told it.
Not randomly. Not loudly. Systematically. Starting with the decisions that matter most.
> The constraint you set three months ago: "never use Redis, the client vetoed it after a production incident." Gone. The GDPR deployment region restriction. Gone. The retry limit you tested empirically after the cascade failure. Gone.
> The model never told you. It just started using defaults.
> This is called context rot. And Cambridge and independent researchers just quantified exactly how bad it is.
> Every production AI system that runs long enough will eventually compress its context to make room for new information. That compression is catastrophically lossy. They tested it directly: 2,000 facts compressed at 36.7× left 60% of the knowledge base permanently irrecoverable. Not hallucinated. Not wrong. Just gone. The model honestly reported it didn't have the information anymore.
> Then they tested something worse. They embedded 20 real project constraints into an 88-turn conversation, the kind of constraints that emerge naturally in any long-running project, then applied cascading compression exactly like production systems do. After one round: 91% preserved. After two rounds: 62%. After three rounds: 46%.
> The model kept working with full confidence the entire time. Generating outputs that violated the forgotten constraints. No error signal. No warning. Just silent reversion to reasonable defaults that happened to be wrong for your specific situation.
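A quick way to feel how this compounds: a toy model (hypothetical, not the researchers' code) where each compaction round independently keeps each fact with probability p. Note that the measured numbers above (91% → 62% → 46%) fall off even faster than this optimistic independence assumption predicts:

```python
# Toy model (hypothetical, not from the study): each compaction round keeps
# each fact independently with probability p, so survival compounds geometrically.
p = 0.91  # per-round retention, matching the 91% observed after round one

for rounds in range(1, 4):
    survival = p ** rounds
    print(f"after {rounds} round(s): {survival:.0%} of facts expected to survive")
```

Even this optimistic model loses a quarter of the facts by round three. The study's measured cascade (62%, then 46%) decayed faster, meaning later rounds destroy proportionally more than the first.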
> They tested this across four frontier models: Claude Sonnet 4.5, Claude Sonnet 4.6, Opus, and GPT-5.4. Every single one collapsed under compression. This isn't a model problem. It's architectural.
→ 60% of facts permanently lost after single compression pass
→ 54% of project constraints gone after three rounds of cascading compression
→ GPT-5.4 dropped to 0% accuracy at just 2× compression
→ Even Opus retained only 5% of facts at 20× compression
→ In-context memory costs $14,201/year at 7,000 facts vs $56/year for the alternative
The AI labs know this. Their solution is bigger context windows. A 10M-token window is a larger bucket. It's still a bucket. Compaction is inevitable for any long-running system. The window size only determines when the forgetting starts, not whether it happens.
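A back-of-envelope sketch of why window size only moves the cliff. All numbers here are illustrative assumptions, not figures from the study:

```python
# Illustrative assumption: a long-running agent accrues ~2,000 tokens per turn
# (messages, tool calls, retrieved documents). Not a measured figure.
TOKENS_PER_TURN = 2_000

# Bigger windows delay the first compaction; none of them prevent it.
for window in (200_000, 1_000_000, 10_000_000):
    turns = window // TOKENS_PER_TURN
    print(f"{window:>10,}-token window: first compaction after ~{turns:,} turns")
```

Under these assumptions, a 10M-token window buys roughly 5,000 turns before the first compaction. After that, the same lossy forgetting begins.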