

0xYieldFarmooor
@0xYieldFarmooor
TradFi investor turned crypto degen and yield farmooor | Prev: GS, CeFi | Views are my own, not FA, DYOR





Y’all remember when Native Markets was more ecosystem aligned and Paxos + Paypal couldn’t get it over the line? Maybe Charles Schwab builder codes driven by Paxos integration would have made for a better long-term vision, because the name of the game in stables is DISTRIBUTION





Schwab Crypto™ is on the way. Schwab Crypto accounts (offered by Charles Schwab Premier Bank, SSB) will provide direct access to Bitcoin and Ethereum trading, in-depth digital assets education, and more. Read the full press release: brnw.ch/21x1F79


Pretty wild to see our work on PULSE show up in a real 1T-scale post-training run done by @cursor_ai.

Cursor built Composer 2 in collaboration with Fireworks and trained it across multiple datacenters, getting huge savings by syncing only the weights that actually changed between RL checkpoints. Fireworks reports that more than 98% of BF16 weights can stay bit-identical from one checkpoint to the next, and they cited our paper on this, too.

That is basically the exact sparsity pattern we showed in our paper, where we introduced PULSE, a lossless method for 100x more efficient weight-sync communication in RL training. Their system is very close to this idea in practice: exploiting the fact that only a tiny fraction of weights actually change between RL steps.

The deeper reason for this is not that RL gradients are sparse. They are not; the gradients are still dense. What becomes sparse is the realized weight update. In RL, learning rates are tiny, and with Adam the update size stays bounded around the learning rate. Then BF16 adds a hard threshold: if the update is too small relative to the weight, it just rounds away, and the stored weight does not change at all. So from one checkpoint to the next, most of the model literally stays identical.

That is why this is such a useful systems idea. Lower precision, like BF16, does not just save compute. It can also save communication, because more tiny updates get absorbed and fewer weights need to be shipped. At that point, compute efficiency and comms efficiency stop being a tradeoff. They start reinforcing each other.

If you want the deeper story on why RL updates get this sparse, the theory behind it, and how to push weight-sync bandwidth down by 100x+, take a look at our paper: arxiv.org/pdf/2602.03839

The Fireworks blog on Composer 2 that cited our work: fireworks.ai/blog/frontier-…

The animation is taken from Fireworks!
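The rounding argument above can be sketched numerically. This is only an illustration, not the PULSE implementation or Fireworks' system: the bfloat16 conversion (truncating the low 16 bits of a float32 with approximate round-to-nearest), the init scale, the learning rate, and the 4-byte-index diff encoding are all my own assumptions for the demo.

```python
import numpy as np

def to_bf16(x: np.ndarray) -> np.ndarray:
    """Round float32 values to bfloat16 precision (stored back as float32).
    Adds half an ulp before truncating the low 16 bits; a sketch, not a
    fully IEEE-correct conversion (NaN edge cases ignored)."""
    u = np.asarray(x, dtype=np.float32).view(np.uint32)
    u = (u + 0x8000) & np.uint32(0xFFFF0000)
    return u.view(np.float32)

rng = np.random.default_rng(0)

# A BF16 "checkpoint": weights at a typical transformer init scale.
w = to_bf16(rng.normal(0.0, 0.02, size=1_000_000).astype(np.float32))

# An RL-sized step: Adam keeps per-weight update magnitudes near the
# learning rate, so with lr ~ 1e-5 most updates fall below half an ulp
# of the weight they are applied to and round away entirely.
lr = 1e-5
update = (lr * rng.uniform(-1.0, 1.0, size=w.shape)).astype(np.float32)
w_new = to_bf16(w + update)

changed = w.view(np.uint32) != w_new.view(np.uint32)
print(f"weights that actually changed: {changed.mean():.2%}")

# Weight sync then only needs the diff: ship (index, value) pairs for
# the changed slots instead of broadcasting every BF16 weight.
payload_ratio = changed.sum() * (4 + 2) / (w.size * 2)
print(f"diff payload vs full BF16 broadcast: {payload_ratio:.2%}")
```

With these assumed scales, updates on the larger weights vanish under BF16 rounding and only weights near zero (where the ulp is small) actually move, so the diff payload is a small fraction of a full broadcast.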




$TAO is ALL of AI
The full tech stack
And the institutions can't buy it yet
I met with them at the @MessariCrypto 2024 Thesis event in NYC last night
There's no liquidity for them to buy it yet in size
No tier 1 exchanges
Remember that






