Ryan McCormick
332 posts

Ryan McCormick
@RyanMcC35236715
"One must imagine Sisyphus happy" Kind Human. Optimist Prime. Pragmatic Systems Builder. ML researcher. Experienced Software Engineer. Mediocre Philosopher.


I'm fairly convinced there's some universal language manifold (= a surface formed by meaning vectors) that both humans and LLMs operate on. But we don't train LLMs to explicitly represent this manifold. We rather train them to approximate it, and to move along it by building curves on it. And those curves are reasoning in geometric terms, like a reasoning trace is a curve on a low-dimensional manifold embedded in a very high-dimensional space. The Linear Representation Hypothesis (arxiv.org/pdf/2311.03658) touches this, but I wonder if there's more recent work that takes the manifold idea further? Would love to see takes from people with serious differential geometry backgrounds on this!









Next-token prediction is myopic. What if transformers learn to predict their own next latent state? 🌠 We present 𝗡𝗲𝘅𝘁-𝗟𝗮𝘁𝗲𝗻𝘁 𝗣𝗿𝗲𝗱𝗶𝗰𝘁𝗶𝗼𝗻 (𝗡𝗲𝘅𝘁𝗟𝗮𝘁): a self-supervised learning method that teaches transformers to form compact world models for reasoning and planning. It also unlocks up to 3.3x faster inference via self-speculative decoding! 🚀


















