
solocrew
@MySoloCrew
I actually make money with AI so I probably know what I am talking about. Army Combat Vet, ex-Big4, AI Researcher - These are my unfiltered thoughts

PSA: If you've been running out of Claude session quotas on Max tier, you're not alone. Read this.

Some insane Redditor reverse-engineered the Claude binaries with MITM and found 2 bugs that could cause cache invalidation. Tokens that aren't cached are 10x-20x more expensive and are killing your quota. If you're using your API keys with Claude, this is even worse. This is also likely why the issue isn't uniform: while over 500 folks replied to me saying "me too," many (including me) didn't see this issue.

There are 2 issues compounding here (per the Redditor, I haven't independently confirmed this):

1st bug he found is a string-replacement bug in bun that invalidates the cache. Apparently this has to do with the custom @bunjavascript binary that ships with the standalone Claude CLI. The workaround there is to use Claude with `npx @anthropic-ai/claude-code`.

2nd bug is worse: he claims that --resume always breaks the cache. And there doesn't seem to be a workaround there, except pinning to a very old version (which will miss out on tons of features). This bug is also documented on GitHub and confirmed by other folks.

I won't entertain the conspiracy theories that Anthropic "chooses" to ignore these bugs because it gets them more $$$. They actively benefit from everyone hitting as many cached tokens as possible, so this is absolutely a great find, and it aligns with my thoughts earlier. The very sudden spike in reporting, and the non-uniform nature (some folks are completely fine, some are hitting quotas after saying "hey"), definitely points to a bug.

cc @trq212 @bcherny @_catwu for visibility in case this helps all of us.


The fact that memory stocks are crashing because of Google’s TurboQuant is a pretty good indicator of how many clueless people this market is filled with. It’s like saying Aramco should crash because Toyota came out with a next-generation hybrid engine.

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI



Wild that there was like 0 big AI news all week. Are we no longer accelerating?





I don't quite get it, is the 27B model smarter than the 35-A3B model?

Amazon uses Rust. Microsoft uses Rust. Google uses Rust. Cloudflare uses Rust. Discord uses Rust. Dropbox uses Rust. Figma uses Rust. Solana uses Rust. Polkadot uses Rust. NEAR uses Rust. Aptos uses Rust. Sui uses Rust. What’s stopping you from learning Rust?


You can now enable Claude to use your computer to complete tasks. It opens your apps, navigates your browser, fills in spreadsheets—anything you'd do sitting at your desk. Research preview in Claude Cowork and Claude Code, macOS only.


