Pinned Tweet
16 posts

Open Compress
@opencompress_ai
Better answers. Faster responses. Lower bills. Drop-in compression for any LLM, any agent, any gateway. We only win when you save. https://t.co/XaenzHBX2k
San Francisco · Joined March 2026
1 Following · 1.5K Followers

Lots of people asked how we track all these compression projects. We use attentionvc to monitor trending repos by category, super useful for staying on top of the space: github.attentionvc.ai/trending/repos…

@chopra_tejas @bc1beat @mksglu @tirth_8205 @jlehman_ @Nielsen777Brian @harel_nimrod @Compresr_yc_w26 @IvanZakazov @devindolar @janreges @0xnihilism @0xSero @ScryaHQ 100% agree, this is the biggest tradeoff. We actually run cache_stable mode in production for exactly this reason: it skips the dict and alias stages to keep prefixes deterministic for Anthropic's 90% cache discount.
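The cache_stable tradeoff described above can be sketched in a few lines. This is a minimal illustration, not the Open Compress implementation; `compress` and `cache_stable_compress` are hypothetical names, and the lossy stage is a stand-in:

```python
def compress(text: str) -> str:
    # Stand-in for any lossy compression stage (dict/alias substitution,
    # token pruning, etc.). Illustrative only: collapse whitespace runs.
    return " ".join(text.split())

def cache_stable_compress(messages: list[str], cached_prefix_len: int) -> list[str]:
    """Compress only the messages past the cached-prefix boundary.

    Messages inside the prefix stay byte-identical, so the provider's
    prefix cache still hits; only the uncached suffix is rewritten.
    """
    prefix = messages[:cached_prefix_len]            # untouched -> cache hit
    suffix = [compress(m) for m in messages[cached_prefix_len:]]
    return prefix + suffix
```

The point of the sketch: any stage that rewrites early messages (dictionaries, aliases) changes the prefix bytes and forfeits the cache discount, so a cache-aware mode has to leave the prefix alone and spend its compression budget on the suffix.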
Open Compress reposted

Thanks for the mention!
Note - in #headroom, we tried the same techniques - like dictionaries etc - BUT it destroys prefix caching.
So - folks should explore these techniques thinking not just about token compression BUT the impact to prefix caching.
Caching aware compression is key :)

@wei03_ SQuAD benchmarks don't capture real agent workloads at all. tool output (file reads, grep, test results) is structurally different.

@opencompress_ai For the Layer 3 tool output compression, what tasks is the benchmark measured on? Results from an agent actually running against a monorepo should differ quite a bit from SQuAD-style reading comprehension.

@mwixamwixa2 and the agent re-reads the same file 3 times and gets the same error in 4 messages lol. paying more tokens for worse output is the real problem

@opencompress_ai 500K tokens just debugging a monorepo... and we are all just quietly paying for it lol

@Ramdevgujj38411 that's the key question. our take is providers won't prioritize it, charging per token means compression is against their business model. same reason AWS didn't build Cloudflare

@opencompress_ai YC backing a token compression startup in 2026 makes complete sense, the unit economics on agents are brutal rn. only question is whether the model providers just build this natively and kill the whole category

@Shubham75450791 yeah early LLMLingua on code was rough. the key difference is content-aware stages, you can't just drop tokens by perplexity when it's code. AST-aware compression that never touches identifiers is a completely different game
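The "never touches identifiers" idea above can be illustrated with Python's stdlib `ast`. A hedged sketch, not the Open Compress pipeline: it drops only docstrings, re-emitting everything else from the parse tree so identifiers, control flow, and semantics are untouched:

```python
import ast

def strip_docstrings(source: str) -> str:
    """Drop docstrings from a module and its functions/classes.

    The tree is re-emitted with ast.unparse, so identifiers and
    program structure are preserved; only docstring tokens go away.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Module, ast.FunctionDef,
                             ast.AsyncFunctionDef, ast.ClassDef)):
            body = node.body
            if (body and isinstance(body[0], ast.Expr)
                    and isinstance(body[0].value, ast.Constant)
                    and isinstance(body[0].value.value, str)):
                # Keep the rest of the body; a bare docstring becomes `pass`.
                node.body = body[1:] or [ast.Pass()]
    return ast.unparse(tree)
```

A perplexity-based pruner has no such guarantee: a rare identifier can score as "surprising" and get dropped, which is exactly what breaks compressed code.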

@opencompress_ai I was pretty skeptical of the whole "train a small model to prune tokens" approach after seeing how bad early LLMLingua was on code. claw-compactor ROUGE-L numbers at that compression ratio are actually hard to argue with though

@lj_xbt Exactly, remove noise on both sides. Input cleaner, output sharper.

@opencompress_ai TLDR: Open Compress would cut down some bs that I told my AI, and then get rid of some bs my AI is going to tell me 😂

Token prices keep falling, but we use more and more of them, so total spend keeps climbing. Token compression is not random deletion of words; it is a complex process that weighs semantic importance and retains the information the LLM needs to generate accurate responses.
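A toy version of importance-weighted pruning, to make the idea above concrete. This is illustrative only: real systems score tokens with a small language model, while this hypothetical `prune_tokens` uses a crude rarity heuristic in its place:

```python
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "is", "to", "in", "that", "it"}

def prune_tokens(text: str, keep_ratio: float = 0.6) -> str:
    """Keep the highest-importance tokens, preserving original order.

    Importance here is a crude proxy: stopwords score 0, everything
    else scores by rarity (rarer tokens carry more information).
    """
    tokens = text.split()
    counts = Counter(t.lower() for t in tokens)

    def score(tok: str) -> float:
        if tok.lower() in STOPWORDS:
            return 0.0
        return 1.0 / counts[tok.lower()]

    budget = max(1, int(len(tokens) * keep_ratio))
    # Take the top-`budget` token indices by score, then restore order.
    ranked = sorted(range(len(tokens)), key=lambda i: score(tokens[i]),
                    reverse=True)[:budget]
    return " ".join(tokens[i] for i in sorted(ranked))
```

Even this toy shows the core contrast with naive truncation: the budget is spent on informative tokens, and word order (which the downstream LLM relies on) is never scrambled.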

@opencompress_ai @chopra_tejas @mksglu @tirth_8205 @jlehman_ @Nielsen777Brian @harel_nimrod @Compresr_yc_w26 @IvanZakazov @devindolar @janreges @0xnihilism @0xSero @ScryaHQ Love being part of this community, love everyone's work on open source.



