
b/acc, context platform engineer
@AccBalanced
AI Factories. Balanced Accelerationist. WEKA CAIO, CNCF kubernetes founding board, Post-PKI.


NEW: Exclusive Interview with Jaimin Rangwalla, Chief Investment Officer of Public Investments at Coatue

In @coatuemgmt's Spring 2026 Investor Update, Jaimin walks through the unexpected winners of the AI cycle: memory, optical, CPUs, and the infrastructure layer quietly outperforming the Mag 7.

We cover:
- Why Coatue is "following the gigawatts"
- Private companies breaking into the global top 25 pre-IPO (OpenAI, Anthropic, SpaceX)
- Cash flow transferring from hyperscalers to AI infrastructure
- The $12T funding engine behind the AI buildout
- Sellers of shortage vs. buyers of shortage
- The Token Economy
- The CPU/GPU flip reshaping compute demand
- Coatue's $6T+ AI market estimate
- Agents launching agents / "1,000 analysts working 24/7"

Read the full deck & watch the update replay below

𝐓𝐈𝐌𝐄𝐒𝐓𝐀𝐌𝐏𝐒
(00:00) Jaimin Rangwalla, CIO of Public Investments at Coatue
(00:56) Inside Coatue HQ
(02:48) Investor Update Kickoff
(04:36) Mapping the AI Stack
(06:02) Why Supply Stays Tight
(07:03) How Jaimin Became CIO
(10:43) Private Giants vs Mag 7
(12:40) Market Breadth and Reordering
(15:24) Where AI Revenue Comes From
(17:04) Tokens and Economy
(19:43) Agents Change Everything
(21:58) OpenClaw Explained
(24:49) Memory Demand Explosion
(27:12) Architecture Shifts Ahead
(27:24) Agents Gain Memory
(27:58) CPU Demand Surge
(28:38) CPU GPU Ratio Flip
(30:21) Key Chip Players
(30:45) Intel Comeback Thesis
(31:41) Semis Go Mainstream
(33:24) Nvidia Mania and GTC
(33:59) Tracking Data Center Buildouts
(35:21) Jobs Lost and Created
(37:30) Sellers Versus Buyers
(40:54) Optical Breakouts
(41:27) Bottlenecks Everywhere
(44:48) Sentiment Versus Fundamentals
(47:10) Handling Volatility
(49:17) Finding New Leaders
(51:18) Trillion Dollar IPOs
(52:48) Risks and Disruptions
(55:00) Coatue Growth Story
(55:58) Staying Curious to Win

Seems Twitter missed the ExploitBench paper? A few observations: we finally got good data on Mythos's security capabilities, and it's very impressive. Mythos got a full exploit chain on 18/41 V8 n-days, while GPT 5.5 only got 1, and open-source models are mostly useless.



@beffjezos Our recently completed Grok V9 1.5T run is looking great and that is before Cursor data is added in supplemental training

This is how the algorithm can completely destroy your reach overnight.

Left: the last 3 months. Right: the last 2 weeks. A super consistent 85-95% drop on all metrics, everything after a viral post going ballistic. I tried everything: a cool down, deleting low-quality posts, blocking bot accounts. I kept posting after the cool down, and nothing really breaks through.

Short hot takes 🛑
Long form with good signal 🛑
Viral-potential posts 🛑
Core-audience value posts 🛑

What bothers me here is that 48h after posting a mega-viral post, I get suppressed back to the Stone Age. This follows previous situations I've had with the Grok-powered algorithm, where it feels like once tweepCred falls far below a certain level, you're locked into a low-reach prison, and every effort to break out makes it harder and harder to do so.

I'm asking for transparency on what we can do as content creators when this happens. I don't want to spam my way out of this. I'd like to know if I did something wrong, how I can address it, and how to take responsibility for the algorithmic suppression, whatever its length. But this limbo is most likely going to make me leave the platform.

NVIDIA has done the impossible and nobody's talking about it.

They trained a 12 BILLION parameter LLM in 4-bit precision on 10 trillion tokens.

For years, the AI industry has been stuck. If you wanted to train a world-class AI, you had to use 16-bit or 8-bit precision. Going lower to 4-bit was a death sentence for the model: it would become unstable, "hallucinate" its own math, and eventually collapse.

But NVIDIA proved that "impossible" was just a math problem. They used a new format called NVFP4. Instead of a standard, rigid structure, NVFP4 uses "micro-scaling": it groups numbers into tiny blocks and applies an individual scaling factor to each one. It's like giving the AI a pair of high-definition glasses for its own data, allowing it to see fine details even with 75% less memory.

The result is a total paradigm shift:
- 2× to 3× faster arithmetic performance.
- 50% reduction in memory usage.
- Near-zero loss in intelligence.

The researchers compared the 4-bit model against a massive 8-bit baseline. The curves are identical. On MMLU, GSM8K, and coding benchmarks, the "tiny" 4-bit version performed within 0.1% of the more expensive model.

This is an economic earthquake. Training a frontier model used to require tens of thousands of GPUs and months of time. NVIDIA just showed we can get the same results with half the hardware and a fraction of the electricity.
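The micro-scaling idea is easy to see in a few lines: each small block of values gets its own scale factor before being snapped onto the coarse 4-bit (FP4 E2M1) grid, so small values in one block aren't crushed by large values elsewhere in the tensor. Below is a toy NumPy sketch of block-wise 4-bit quantization, not NVIDIA's actual NVFP4 recipe — the real format also constrains the block-scale encoding and adds tensor-level scaling, and `quantize_microscaled` is a name invented here for illustration.

```python
import numpy as np

# Non-negative magnitudes representable in FP4 E2M1 (sign handled separately)
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_microscaled(x, block_size=16):
    """Toy block-wise ("micro-scaling") 4-bit quantize + dequantize.

    Each block of `block_size` values gets its own scale factor, so fine
    detail survives even at 4-bit precision. Sketch only: NVFP4 itself
    additionally restricts how the per-block scales are stored.
    """
    x = np.asarray(x, dtype=np.float64)
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)

    # Per-block scale: map the block's largest magnitude onto the top
    # FP4 value (6.0), so every block uses the full 4-bit range
    scales = np.abs(blocks).max(axis=1, keepdims=True) / FP4_GRID[-1]
    scales[scales == 0] = 1.0  # avoid dividing an all-zero block by 0

    # Snap each scaled magnitude to the nearest representable FP4 value
    scaled = blocks / scales
    idx = np.abs(np.abs(scaled)[..., None] - FP4_GRID).argmin(axis=-1)
    q = np.sign(scaled) * FP4_GRID[idx]

    # Dequantized reconstruction, trimmed back to the original length
    return (q * scales).reshape(-1)[:len(x)]

# Values spanning a wide dynamic range: per-block scales keep the
# relative error bounded where a single global scale would not
x = np.random.randn(64) * np.logspace(-2, 1, 64)
err = np.abs(quantize_microscaled(x) - x) / (np.abs(x) + 1e-12)
print(f"median relative error: {np.median(err):.3f}")
```

Values that already sit on a scaled copy of the FP4 grid round-trip exactly, which is the sense in which per-block scaling "sees fine details" that a single tensor-wide scale would flush to zero.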


Today we release Lighthouse Attention, a selection-based hierarchical attention mechanism for long-context pre-training that delivers a 1.4-1.7× wall-clock speedup at 98K context. It runs the same forward+backward pass ~17× faster than standard attention at 512K context on a single B200, without a custom sparse-attention kernel, a straight-through estimator, or an auxiliary loss.

During training, queries, keys, and values are pooled symmetrically into a multi-resolution pyramid. We then score the pyramid for every head, a top-k cascade selects a small hierarchical dense sub-sequence, and after a sorting pass that enforces causality, we use standard attention for token mixing. A brief full-attention resume at the end converts the checkpoint back into a competent dense-attention model.

We validated this using 530M-parameter Llama-3 models across 50B tokens, with up to 1M-token benchmarks across 32 B200s under context parallelism.

The work on Lighthouse Attention was led by @bloc97_, @SubhoGhosh02, and @theemozilla.
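The select-then-attend pattern described above can be sketched with one pooling level and a single query: keys are mean-pooled into coarse blocks, the query scores the pooled blocks, the top-k blocks are kept, their token indices are sorted to preserve causal order, and standard dense attention runs only over that sub-sequence. This is a minimal NumPy sketch of the idea, not the released method — the real mechanism pools Q/K/V symmetrically into a multi-resolution pyramid, scores per head, and cascades top-k across levels; `lighthouse_attention_sketch` is a name invented here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def lighthouse_attention_sketch(q, k, v, block=8, top_k=4):
    """Toy single-query, single-level selection-based attention.

    1. Pool keys into coarse blocks (one pyramid level).
    2. Score pooled blocks against the query; keep the top-k blocks.
    3. Sort the selected token indices (the causality-preserving pass).
    4. Run standard dense attention over the selected tokens only.
    """
    n, d = k.shape
    # Level-1 pooling: mean-pool keys over blocks of `block` tokens
    k_pooled = k[: n - n % block].reshape(-1, block, d).mean(axis=1)

    # Score each pooled block and select the top-k highest-scoring ones
    block_scores = k_pooled @ q
    sel_blocks = np.argsort(block_scores)[-top_k:]

    # Gather token indices of the selected blocks, sorted so the
    # sub-sequence keeps the original (causal) ordering
    idx = np.sort(np.concatenate(
        [np.arange(b * block, (b + 1) * block) for b in sel_blocks]))

    # Standard dense attention, but only over the selected sub-sequence
    w = softmax(k[idx] @ q / np.sqrt(d))
    return w @ v[idx], idx

rng = np.random.default_rng(0)
q = rng.standard_normal(64)
k = rng.standard_normal((512, 64))
v = rng.standard_normal((512, 64))
out, idx = lighthouse_attention_sketch(q, k, v)
print(out.shape, len(idx))  # attends to 32 of 512 tokens
```

Because every step here is ordinary pooling, top-k, gather, and dense attention, the pattern needs no custom sparse kernel — which matches the post's claim that the speedup comes from selection rather than specialized CUDA.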
