Vlad Mnih
44 posts

In our new work - Algorithm Distillation - we show that transformers can improve themselves autonomously through trial and error without ever updating their weights. No prompting, no finetuning. A single transformer collects its own data and maximizes rewards on new tasks. 1/N
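The idea in this tweet, a model whose weights are frozen but whose behaviour improves because its own collected experience grows, can be illustrated with a toy sketch. This is not the paper's Algorithm Distillation method: a simple epsilon-greedy rule on a multi-armed bandit stands in for the transformer, and all names here (`history_conditioned_policy`, `run_task`) are hypothetical. The only thing that changes across trials is the context the policy conditions on, never its parameters.

```python
import numpy as np

def history_conditioned_policy(history, n_arms, rng):
    # Frozen "policy": nothing here is ever updated. Improvement comes
    # only from the growing history (the context), standing in for the
    # in-context learning a trained transformer would do.
    if rng.random() < 0.1 or not history:
        return int(rng.integers(n_arms))  # explore / cold start
    totals = np.zeros(n_arms)
    counts = np.zeros(n_arms)
    for arm, reward in history:
        totals[arm] += reward
        counts[arm] += 1
    means = totals / np.maximum(counts, 1)
    return int(np.argmax(means))          # exploit empirical best arm

def run_task(n_arms=5, n_trials=300, seed=0):
    # A hypothetical new task: unknown Bernoulli reward probabilities.
    rng = np.random.default_rng(seed)
    probs = rng.random(n_arms)
    history = []   # the agent's own collected data
    rewards = []
    for _ in range(n_trials):
        arm = history_conditioned_policy(history, n_arms, rng)
        r = float(rng.random() < probs[arm])
        history.append((arm, r))
        rewards.append(r)
    return rewards

rewards = run_task()
```

With no weight updates anywhere, later trials tend to earn more reward than early ones, purely because the conditioning history has accumulated trial-and-error data.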

We have built CommonSim-1, a neural simulation engine controlled by images, actions and text. This is a new era in generative AI where simulators will grow and adapt with experience. Read on for more and sign up for early access to apps/API/open-source: csm.ai/commonsim-1-ge…

Unsupervised skill learning allows RL agents to learn useful behaviors in the absence of task-specific rewards. But beneath its promising facade lies a dark secret… a challenging exploration problem! Check out our #iclr2022 Spotlight to learn more: arxiv.org/abs/2107.14226. 🧵⬇️