Norman Casagrande
2.4K posts

@nova77t
ML, history, space & sciencey stuff. Research Eng @ Google DeepMind. Opinions are my own etc. Find me @adabstract.bsky.social & @[email protected]

Producer is now part of Google! We’re proud to be joining @GoogleLabs and @GoogleDeepMind to build the future of music creation. Producer is here to stay, with more on the way. Come make music with us!

One of the best visual explanations I've ever seen for why scaling Transformers works, but is suboptimal, as it's just brute-forcing things, by @YesThisIsLion (co-author of the Transformer) on @MLStreetTalk:

"In the (rejected) paper "Intelligent Matrix Exponentiation", they show the decision boundary of a classic MLP with a ReLU/Tanh activation function on the classic Spiral dataset."

"You can see they both technically solve it with great scores on the test set. Next, they show the decision boundary of the "M-layer" they propose in the paper. And it represents the spiral ... as a spiral!"

"Shouldn't we? If the data is a spiral... shouldn't we represent it as a spiral?"

"If you look back at the decision boundaries of the MLP, it's clear that you just have these tiny, piecewise separations without learning the concept of a spiral. That's what I mean!"

"If you train these things enough, it can fit the spiral and get a high accuracy. But there's no indication that the MLP actually understands a spiral. When you represent it as a spiral, it extrapolates correctly, 'cause the spiral just keeps going out."
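To make the extrapolation point concrete, here is a minimal Python sketch. It is not the M-layer from the paper and not an MLP. It assumes a hypothetical noise-free two-arm Archimedean spiral (radius equal to the parameter t) and a hand-written rule that labels a point by its phase along the spiral. The names `spiral_point`, `spiral_classifier`, and `accuracy` are made up for illustration.

```python
import math

def spiral_point(t, arm):
    """Point on arm 0 or 1 of an assumed two-arm Archimedean spiral (radius = t)."""
    angle = t + arm * math.pi  # the second arm is the first rotated by 180 degrees
    return t * math.cos(angle), t * math.sin(angle)

def spiral_classifier(x, y):
    """Label a point by its phase along the spiral, not by memorized regions."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    # On arm 0, theta - r is a multiple of 2*pi; on arm 1 it is offset by pi.
    return 0 if math.cos(theta - r) > 0 else 1

def accuracy(t_min, t_max, n=500):
    """Fraction of correctly labeled points for t sampled evenly in [t_min, t_max]."""
    correct = 0
    for i in range(n):
        t = t_min + (t_max - t_min) * i / (n - 1)
        for arm in (0, 1):
            x, y = spiral_point(t, arm)
            correct += spiral_classifier(x, y) == arm
    return correct / (2 * n)

in_range = accuracy(0.5, 10.0)     # radii a model might have been trained on
out_of_range = accuracy(10.0, 40.0)  # radii far beyond that range
print(in_range, out_of_range)
```

Because the rule encodes the spiral itself, it stays correct at radii well outside the first range. A piecewise fit has no such guarantee, which is the contrast the quote is drawing.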

these 2 prompts for Nano Banana Pro will save you a ton of time. just upload an image, generate the cinematic grid, and pull the frames you like! examples made in Higgsfield AI, and prompts below 👇