
We’re back with the BFN story in Episode 2 of the Let’s Talk Research podcast. Hear Alex Graves dive into real-world applications of Bayesian Flow Networks, including their use in protein sequencing and antibody design. 🧵⬇️

Today we're putting out an update to the JAX TPU book, this time on GPUs. How do GPUs work, especially compared to TPUs? How are they networked? And how does this affect LLM training? 1/n


Scaling AI research agents is key to tackling some of the toughest challenges in the field. But what's required to scale effectively? It turns out that simply throwing more compute at the problem isn't enough.

We break down an agent into four fundamental components that shape its behavior, regardless of specific design or implementation choices:
- Environment: The context (infrastructure) in which the agent operates
- Search Policy: How the agent allocates resources
- Operator Set and Policy: The available actions the agent can take and how it chooses among them
- Evaluation Mechanism: How the agent determines whether a particular direction is promising

We specifically focus on ML research agents tasked with real-world machine learning challenges from Kaggle competitions (MLE-bench). What we found is that factors like the environment, the agents' core capabilities (the operator set), and overfitting emerge as critical bottlenecks long before computational limitations come into play. Here are our key insights:

🔹 Environment: Agents can't scale without a robust environment that offers flexible and efficient access to computational resources. For instance, simply running the baseline agents in the (open-sourced) AIRA-dojo environment boosts performance by 10% absolute (30% relative), highlighting just how crucial the environment is.

🔹 Agent design and core capabilities: Resource allocation optimization only matters if agents can actually make good use of those resources. Our analysis shows that the agents' operator set, the core actions they perform, can limit performance gains from more advanced search methods like evolutionary search and MCTS. We achieve SoTA performance by designing an improved operator set that better manages context and encourages exploration, and coupling it with these search policies.

🔹 Evaluation: Accurate evaluation of the solution space is critical and reveals a significant challenge: overfitting. Ironically, agents that are highly effective at optimizing perceived values tend to be more vulnerable to overfitting, a problem that intensifies with increased compute resources. We observe up to 13% performance loss due to suboptimal selection of final solutions caused by this issue.

🔹 Compute: Providing agents with sufficient compute resources is essential to avoid introducing an additional limitation and bias into evaluations. We demonstrate this through experiments in which we scale the runtime from 24 to 120 hours.

In summary, successfully scaling AI research agents requires careful attention to these foundational aspects. Ignoring them risks turning scaling efforts into, at best, exercises in overfitting. These insights set the stage for exciting developments ahead!
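The four-component decomposition above (environment, search policy, operator set, evaluation mechanism) can be sketched as a minimal search loop. This is only an illustrative toy, not the AIRA-dojo API: every name here (`run_agent`, the operator lambdas, the toy objective) is a hypothetical stand-in for the real components.

```python
import random

random.seed(0)  # make the toy run reproducible

def run_agent(environment, operators, search_policy, evaluate, budget):
    """Toy research-agent loop over the four components described above."""
    best_solution, best_score = None, float("-inf")
    frontier = [environment["initial_solution"]]
    for _ in range(budget):                 # compute budget (number of expansions)
        parent = search_policy(frontier)    # search policy: pick a node to expand
        op = random.choice(operators)       # operator policy: choose an action
        child = op(parent)                  # act in the environment
        score = evaluate(child)             # evaluation mechanism (may overfit!)
        frontier.append(child)
        if score > best_score:
            best_solution, best_score = child, score
    return best_solution, best_score

# Toy instantiation: maximize -(x - 3)^2 by nudging a number in 0.5 steps.
env = {"initial_solution": 0.0}
ops = [lambda x: x + 0.5, lambda x: x - 0.5]
objective = lambda x: -(x - 3) ** 2
greedy = lambda frontier: max(frontier, key=objective)  # greedy search policy

sol, score = run_agent(env, ops, greedy, objective, budget=50)
print(sol)  # climbs toward the optimum at 3.0
```

The point of the decomposition is that each component can bottleneck the others: a stronger `search_policy` (e.g. MCTS instead of greedy) cannot help if the `operators` can't reach good solutions, and a noisy `evaluate` makes the final `best_solution` selection overfit, exactly the failure modes the thread describes.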







📆 Six months, four publications, one cover. We’re halfway through the year and InstaDeep research is powering ahead with multiple boundary-pushing papers published in the @NaturePortfolio! Catch up on the highlights below 🔽









We’ve talked routing—now let’s dive into placement! Discover how DeepPCB Placement elevates your PCB design process 🚀 Real-world use-cases showing why placement + routing = winning combo🏅 👉 bit.ly/4jxGwyL


Human-generated data has fueled incredible AI progress, but what comes next? 📈 On the latest episode of our podcast, @FryRsquared and David Silver, VP of Reinforcement Learning, talk about how we could move from the era of relying on human data to one where AI could learn for itself. Watch now →

00:00 Introduction
01:50 Era of experience
03:45 AlphaZero
10:19 Move 37
15:20 Reinforcement learning and human feedback
24:30 AlphaProof
29:50 Math Olympiads
35:00 Experience based methods
42:56 Hannah's reflections
44:00 Fan Hui joins