Michael Beukman

61 posts

@mcbeukman

PhD Student at @FLAIR_Ox, previously at @raillabwits. Ex SR @GoogleDeepMind. Interested in open-ended learning, scaling RL, and continual learning.

Oxford, England · Joined July 2022
190 Following · 534 Followers
Pinned Tweet
Michael Beukman @mcbeukman
1/ As compute continues to grow and simulators continue to improve, it is becoming feasible to train RL agents for billions or trillions of timesteps. However, this is only useful if agents can continue learning over such long training horizons, which is far from given 👇
5
43
325
85.2K
Michael Beukman retweeted
Alex Goldie @AlexDGoldie
1/ 🪩 Automating the discovery of new algorithms could unlock significant breakthroughs in ML research. But optimising agents for this research has been limited by too few tasks to learn from! Introducing DiscoGen, a procedural generator of algorithm discovery tasks 🧵
3
41
146
35.6K
Michael Beukman @mcbeukman
10/ Armed with these insights, we turn to the Kinetix benchmark, an open-ended universe of 2D physics-based tasks. We scale to 1 million parallel environments, and show that this leads to monotonic performance improvement for more than 1 trillion timesteps without stagnating 🚀
2
3
23
3.1K
Michael Beukman retweeted
nathan monette @nathanrmonette
I've compiled some notes on Unsupervised Environment Design (UED): nmonette.github.io/assets/ued_not… Please don't hesitate to reach out if interested in talking more about UED :)
1
3
27
2.1K
Michael Beukman retweeted
nathan monette @nathanrmonette
Was amazing to present my first paper at @RL_Conference !! Really awesome to meet new folks from the community :)
2
7
64
6K
Michael Beukman retweeted
Clarisse Wibault @ClarisseWibault
How can we bypass the need for online hyper-parameter tuning in offline RL? @FLAIR_Ox is introducing two fully offline algorithms: SOReL, for accurate offline regret approximation, and TOReL, for offline hyper-parameter tuning! arxiv.org/html/2505.2244…
1
9
26
4.6K
Michael Beukman retweeted
nathan monette @nathanrmonette
Excited to announce my first paper, with @j_foerst and @FLAIR_Ox, was accepted into @rl_conference 2025! We establish a new UED method called NCC that obtains strong performance based on principles of optimisation theory.
1
9
73
12.6K
Michael Beukman retweeted
Matthew Jackson @JacksonMattT
🌹 Today we're releasing Unifloral, our new library for Offline Reinforcement Learning!

We make research easy:
⚛️ Single-file
🤏 Minimal
⚡️ End-to-end Jax

Best of all, we unify prior methods into one algorithm - a single hyperparameter space for research! ⤵️
4
37
172
94.5K