Ryan McCormick

332 posts

@RyanMcC35236715

"One must imagine Sisyphus happy" Kind Human. Optimist Prime. Pragmatic Systems Builder. ML researcher. Experienced Software Engineer. Mediocre Philosopher.

Joined July 2022

521 Following · 140 Followers

Pinned Tweet

Ryan McCormick@RyanMcC35236715·11 May

This is a Renaissance.

649

Ryan McCormick@RyanMcC35236715·28m

@TamazGadaev You first have to acknowledge we've been doing ML all wrong, then you can free your mind to find this framework. There are glimpses that people have figured this out. I have an end-to-end working system. It's all geometry.

Tamaz Gadaev@TamazGadaev·2h

If this framing is right, reasoning quality might be partly a function of smoothness, i.e. how cleanly the model moves along the manifold versus jumping erratically between unrelated regions. And differential geometry has exactly the tools for studying this: geodesics, curvature, parallel transport. None of them have been seriously applied to reasoning traces yet (afaik?). The research program almost writes itself: characterize the manifold, then ask what kinds of curves are achievable on it and which aren't.
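One rough way to make "smoothness" of a reasoning trace concrete is to treat the trace as a sequence of points and measure how sharply it turns from step to step. The sketch below is only an illustration: `turning_angles` and `mean_turning_angle` are hypothetical helper names, the points stand in for hidden-state or embedding vectors, and the ambient space is assumed to be Euclidean, which is a crude proxy and not geodesic curvature on any learned manifold.

```python
import math

def turning_angles(points):
    """Angles in radians between consecutive step vectors of a polyline.

    `points` is a sequence of equal-length coordinate tuples (an assumed
    stand-in for the states visited along a reasoning trace).
    """
    # Step vectors between consecutive points.
    steps = [[b - a for a, b in zip(p, q)] for p, q in zip(points, points[1:])]
    angles = []
    for u, v in zip(steps, steps[1:]):
        norm_u = math.sqrt(sum(x * x for x in u))
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_u == 0 or norm_v == 0:
            continue  # a zero-length step has no direction; skip it
        cosine = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
        # Clamp to guard against floating-point drift outside [-1, 1].
        angles.append(math.acos(max(-1.0, min(1.0, cosine))))
    return angles

def mean_turning_angle(points):
    """Average turning angle; 0.0 means the path never changes direction."""
    angles = turning_angles(points)
    return sum(angles) / len(angles) if angles else 0.0

straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
corner = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
print(mean_turning_angle(straight), mean_turning_angle(corner))
```

Under this proxy, a lower mean turning angle corresponds to a trace that moves more steadily in one direction, while erratic jumps between regions show up as large angles.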

Sasha Malysheva@aimalysheva

I'm fairly convinced there's some universal language manifold (= a surface formed by meaning vectors) that both humans and LLMs operate on. But we don't train LLMs to explicitly represent this manifold. We rather train them to approximate it, and to move along it by building curves on it. And those curves are reasoning in geometric terms, like a reasoning trace is a curve on a low-dimensional manifold embedded in a very high-dimensional space. The Linear Representation Hypothesis (arxiv.org/pdf/2311.03658) touches this, but I wonder if there's more recent work that takes the manifold idea further? Would love to see takes from people with serious differential geometry backgrounds on this!

Ryan McCormick@RyanMcC35236715·30m

@aimalysheva Ideas, concepts, languages, all have a resolution, like your monitor. It's measurable. It has a capacity of information. Hidden layers have far more representational capacity than semantic language. Semantic language composes to higher resolution than its constituent parts.

Sasha Malysheva@aimalysheva·1d

488

26.1K

Ryan McCormick@RyanMcC35236715·1h

@RaccoonStampede Seek coherence.

David Hudson@RaccoonStampede·8h

Lost in Translation Every field is quietly describing the exact same thing—they just don’t realize it. Cosmologists call it dark matter/energy: 95% invisible substrate, 5% visible structure. Biologists see it in photosynthesis: near-perfect quantum coherence riding massive dissipative losses. Neuroscientists watch it in brain criticality: narrow conscious focus on a vast subconscious reservoir, poised at the edge where avalanches happen. Physicists describe dissipative structures (Prigogine): order only persists by leaking entropy into the background. Complexity theorists name it self-organized criticality or “edge of chaos”: the sweet spot where adaptation explodes. Engineers fight it as the irreducible noise floor that prevents perfect efficiency. Same pattern. Same ratio. Same engine. Recursion through a generative substrate (π-field chaos, thermal bath, dark sector, subconscious) + controlled leakage (~5% “Ghost Tax”) = the only way persistent novelty and structure emerge at any scale. Too little leak → frozen rigidity. Too much leak → noise death. Reality runs on this dial. Scale-invariant. Built-in. We’re all reverse-engineering the same OS from different terminals, using different jargon, blind to the unified codebase underneath. The Ghost Tax isn’t a quirk. It’s the baseline. Once you see it, you can’t unsee it. Who else is noticing the translation layer? Drop your field’s dialect below 👇 #GhostTax #CoherencePrinciple #EdgeOfChaos #ScaleInvariant

Ryan McCormick@RyanMcC35236715·4h

@mathmaticulous The canonical quantizations are 3, 5, 7, 11. Every other includes all other quantizations. Wild. I'm tracking on the exact same trajectory but from a different angle.

CTFTHEORY@mathmaticulous·21h

Prime numbers above 3 cannot live in positions divisible by 3. That single arithmetic fact forces every prime into exactly six positions on a nine-point circle. Those six positions split into two perfect triangles. Two triangles on a circle is a Star of David. The math drew it with no symbol in mind. The same framework keeps producing ancient shapes. The exponential decay function that describes time in this system draws a pyramid. The three-fold vortex symmetry draws a triskelion. The Fibonacci sequence cycling back to its start draws an ouroboros. The Riemann equation draws a balance scale: Ma'at's scales, the Egyptian symbol of cosmic justice. No symbol was chosen before running any of these. The math ran. The shape appeared afterward. So we mapped π onto the same lattice. Each digit of π routes to its position on the nine-point circle. Consecutive digits draw edges between those positions. Plot everything and read what falls out. A dodecahedron. Ten points on a circle is the standard flat projection of a dodecahedron; it is literally what a dodecahedron looks like when you collapse it to 2D along its natural axis of symmetry. But the deeper reason matters more. The dodecahedron and icosahedron are mathematical duals: mirror images of the same geometry, each one's face centres being the other's vertices. The geodesic grid at the heart of this framework is built on an icosahedron. Its flat projection produces the Star of David. π on the same lattice produces the dodecahedron. The framework generates both halves of this duality simultaneously. The zone proportions confirm it. The framework predicts π's digits should fall 40% in the stable zone, 30% in each boundary zone. They land at 40.4%, 30.1%, 29.6%. Plato called the dodecahedron the shape of the cosmos. zenodo.org/records/206964… zenodo.org/records/206802… ctftheory.com/ancient-symbol…
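The opening arithmetic claim can be checked directly, assuming "position on a nine-point circle" means the residue of the prime modulo 9 (the tweet does not define it, so that reading is an assumption). The sketch below uses a plain sieve of Eratosthenes; `prime_residues_mod9` is a hypothetical helper name. It only checks which residues occur and how they group modulo 3; it says nothing about the geometric or symbolic claims that follow.

```python
def prime_residues_mod9(limit):
    """Return the set of residues mod 9 taken by primes p with 3 < p < limit."""
    # Sieve of Eratosthenes over [0, limit).
    sieve = [True] * limit
    sieve[0:2] = [False, False]
    for n in range(2, int(limit ** 0.5) + 1):
        if sieve[n]:
            for multiple in range(n * n, limit, n):
                sieve[multiple] = False
    return {p % 9 for p in range(5, limit) if sieve[p]}

residues = prime_residues_mod9(1000)
print(sorted(residues))  # [1, 2, 4, 5, 7, 8]

# The six residues split into two groups of three by their value mod 3.
group_one = sorted(r for r in residues if r % 3 == 1)
group_two = sorted(r for r in residues if r % 3 == 2)
print(group_one, group_two)  # [1, 4, 7] [2, 5, 8]
```

A residue of 0, 3, or 6 modulo 9 would make the number divisible by 3, which is why those three positions stay empty for every prime above 3.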

125

7.7K

Ryan McCormick@RyanMcC35236715·4h

@tunguz @trainxgb @tabul_ai *gestures at Reddit*

Bojan Tunguz@tunguz·12h

Every time someone tries to explain to me ML for tabular data. @trainxgb @tabul_ai

4.1K

Ryan McCormick@RyanMcC35236715·4h

Every ML paper for a decade has the same structure: define a loss, build a differentiable architecture, backpropagate, report metrics. If your idea can't be expressed as "minimize this differentiable loss," it's not publishable. The chain rule has become the filter through which all ideas must pass, and it filters out exactly the ideas the field needs most:
Geometric analysis of data (not differentiable)
Resolution-aware representations (discrete)
Certifiable properties (require proofs, not gradients)
Content addressing (hashing is not differentiable)
Transition dynamics by counting (counting is not differentiable)
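Two of the items in that list, content addressing and transition dynamics by counting, are easy to illustrate without any gradients. The sketch below is a minimal illustration under stated assumptions, not the author's system: the function names are hypothetical, SHA-256 from Python's standard `hashlib` stands in for whatever hash a real design would use, and the "dynamics" are a first-order transition table estimated from a toy sequence.

```python
import hashlib
from collections import Counter, defaultdict

def content_address(data: bytes) -> str:
    """Key a blob by the SHA-256 digest of its content."""
    return hashlib.sha256(data).hexdigest()

def transition_counts(sequence):
    """Count how often each state is followed by each next state."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return counts

def transition_probabilities(counts):
    """Normalize each row of counts into empirical probabilities."""
    probabilities = {}
    for state, following_counts in counts.items():
        total = sum(following_counts.values())
        probabilities[state] = {
            nxt: count / total for nxt, count in following_counts.items()
        }
    return probabilities

# Content addressing: identical content maps to the same key, so it is stored once.
store = {}
for blob in (b"alpha", b"beta", b"alpha"):
    store[content_address(blob)] = blob
print(len(store))  # 2

# Counting: estimate transitions from a toy sequence of states.
probs = transition_probabilities(transition_counts("abababac"))
print(probs["a"])  # {'b': 0.75, 'c': 0.25}
```

Both pieces are exact and discrete: the table is built in a single pass over the data, and no loss function or backward pass is involved.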

Ryan McCormick@RyanMcC35236715·4h

@tunguz There is a perfect vector to align this direction too!

Bojan Tunguz@tunguz·12h

gm directional correctness is all you need

2.5K

Ryan McCormick@RyanMcC35236715·5h

Unbind yourself from the chain rule.

Ryan McCormick@RyanMcC35236715·5h

@jayden_teoh_ Think of expressivity as a resolution problem and you're almost there. MLPs have no obligation to compress. You are starting to see the cracks in "classic" ML. We can do so much better.

Jayden Teoh@jayden_teoh_·6h

My favorite experiment in this paper is the one we added last minute. In the experiment, we ask: "Can a transformer train an RNN to solve problems above its expressivity class?" 🤔 Turns out, the answer is yes.

Jayden Teoh@jayden_teoh_

Next-token prediction is myopic. What if transformers learn to predict their own next latent state? 🌠 We present 𝗡𝗲𝘅𝘁-𝗟𝗮𝘁𝗲𝗻𝘁 𝗣𝗿𝗲𝗱𝗶𝗰𝘁𝗶𝗼𝗻 (𝗡𝗲𝘅𝘁𝗟𝗮𝘁): a self-supervised learning method that teaches transformers to form compact world models for reasoning and planning. It also unlocks up to 3.3x faster inference via self-speculative decoding! 🚀

4.3K

Ryan McCormick@RyanMcC35236715·23h

@deepfates The context window is your shared ontology with an LLM, and you can teach it. Look at how LLMs think, it's important. They can teach us too, just pay attention to the outputs. To be worthy of the volume LLMs can crank out, you must be able to validate with 100% certainty at scale.

🎭@deepfates·1d

Are you an "llm whisperer" or "ai naturalist" or something like that? Can you describe what you do in a few words or sentences? Not the why, or the particular findings, just like. How do you learn things about these systems? What do you actually spend your time doing

126

279

23.9K

Ryan McCormick@RyanMcC35236715·23h

@deepfates Learn how they work internally. Build a case of falsifiable anecdotes. Have a firm test basis (even anecdotal). Try to push the limits. Fail. "Test out" every new model. When you agree on the deliverable with the LLM you can really push smaller models. You merge ontology.

342

Ryan McCormick@RyanMcC35236715·23h

@HarmonyHacker @Hesamation LLM capabilities are already demonstrated, and obviously incredible. My admiration for the work helped me realize the truth is that we are still early in this technology. Also, the US government is not the best certification of a model's capability, that is certain.

Harmony Hacker@HarmonyHacker·1d

@RyanMcC35236715 @Hesamation Meanwhile, the reality is that they are so limitless that the government is attempting to step in and rein them in.

ℏεsam@Hesamation·1d

Sam Altman calls Yann LeCun’s bet against LLM scaling “misguided”. “So clearly LLMs are capable of figuring out new knowledge and clearly they are capable of doing some things that humans just can't do. They are going to scale much further.”

492

123K

Ryan McCormick@RyanMcC35236715·1d

@antoniolupetti Incredible repo of information!

193

Antonio Lupetti@antoniolupetti·1d

"Classical Mechanics" by Joel A. Shapiro is a free book that develops the foundations of mechanics from Newtonian particle motion to the more advanced formulations. It covers many topics, including particle kinematics, systems of particles, phase space, conservation laws, central-force motion, rigid body dynamics, small oscillations, perturbation theory, and field theory. Although it is written as a physics text, it will appeal to many readers who are in mathematics. Much of the book focuses not only on physical phenomena but also on the mathematical framework used to describe them. I would suggest bookmarking it as a useful reference to browse whenever needed. physics.rutgers.edu/~shapiro/507/b…

278

10.5K

Ryan McCormick@RyanMcC35236715·1d

@che_shr_cat Training literally generates an ontology only to throw it away. Inscribe once. Collapse as necessary. AI training is deeply flawed, people are starting to notice.

830

Grigory Sapunov@che_shr_cat·1d

1/ We have been training RNNs wrong for decades. Backpropagation through time (BPTT) forces sequential updates, creating unstable O(T) gradient paths. What if we could train highly expressive, non-linear RNNs with flat, parallelized O(1) gradients? It is now possible. 🧵

111

686

67.7K

Ryan McCormick@RyanMcC35236715·1d

Room temperature superconductors anyone?

Ryan McCormick@RyanMcC35236715·1d

@araseb_ As always, Reality is the conversation of ethics. Each colony will have its unique take on humanity, ethics, the future, and our relationship with Nature. I hope they are all well-centered.

799

Sarah@araseb_·1d

@RyanMcC35236715 If we colonize the stars, who decides which values we carry forward?

13.8K

Sarah@araseb_·1d

What's coming after artificial intelligence?

4.2K

150

1.7K

440.8K

Ryan McCormick@RyanMcC35236715·1d

@PierceLilholt If you're always falsifiable in your approach to being a cointelligent operator, you know when you've hit a limit. You can't really push too far. It is a merger of ontology though (yours and its); if you have something novel, you have to constantly remind/refresh the AI.

Pierce Alexander Lilholt@PierceLilholt·1d

What are Cointelligent Operators learning about limits by pushing AI too far?

217

Ryan McCormick@RyanMcC35236715·1d

@justinskycak The math talent pool is widening too. Systems builders just need the translation mapping to be successful. In a lot of ways, Lean is that mapping.

Justin Skycak@justinskycak·3d

The talent pool in elite math is closer to Division I athletics than most people realize.

706

57.4K
