Matthew Farrugia-Roberts

22 posts


@MatthewFdashR

Grad student trying to understand the history of humanity, the future of AI, and how to make both of these things work together in the present.

Oxford, UK · Joined June 2025
6 Following · 41 Followers
Matthew Farrugia-Roberts @MatthewFdashR ·
Have you ever tried to look inside the run folders that W&B makes for every deep learning experiment? Here's a deep dive into how I spent a few weeks freeing 118 GB of experimental archives from undocumented and corrupted binary .wandb files: far.in.net/free-wandb
Matthew Farrugia-Roberts @MatthewFdashR ·
I'm thrilled to be a part of delivering the first course on AI Safety and Alignment at the University of Oxford! Next week is going to be intense and I'm looking forward to it!
Fazl Barez @FazlBarez

🚨New AI Safety Course @aims_oxford! I'm thrilled to launch a new course called AI Safety & Alignment (AISAA) on the foundations & frontier research of making advanced AI systems safe and aligned at @UniofOxford. What to expect 👇 robots.ox.ac.uk/~fazl/aisaa/

Matthew Farrugia-Roberts @MatthewFdashR ·
@benjaminsmayhew @Karim_abdelll Not exactly sure what you mean by 'shape of coherence' and 'invoked structure'. Possibly relevant: our problem setting (§4) can be straightforwardly adapted to work with true/proxy versions of any MDP component(s); it doesn't necessarily have to be the reward that differs.
Karim Abdel Sadek @Karim_abdelll ·
*New AI Alignment Paper* 🚨 Goal misgeneralization occurs when AI agents learn the wrong reward function, instead of the human's intended goal. 😇 We show that training with a minimax regret objective provably mitigates it, promoting safer and better-aligned RL policies!
Matthew Farrugia-Roberts @MatthewFdashR ·
@jesse_hoogland Goal misgeneralisation remains an important risk model for future advanced AI systems. We should continue to research how neural networks choose between different solutions and leverage that understanding into methods of avoiding unintended and dangerous solutions in the future.
Matthew Farrugia-Roberts @MatthewFdashR ·
@jesse_hoogland For more complex environments, we still need better UED methods. But UED is young! There are plenty of plausible directions for improving over the methods that have been proposed so far. The question is: is there enough room for improvement for this to help when it counts?
Matthew Farrugia-Roberts @MatthewFdashR ·
At least for me, the big-picture motivation behind our RLC paper is a research vision for scalable AI alignment via minimax regret autocurricula. Learn about the paper via co-author @Karim_abdelll: 🧵👉x.com/Karim_abdelll/… Learn about why I think this is important work 🧵👇
Karim Abdel Sadek @Karim_abdelll

*New AI Alignment Paper* 🚨 Goal misgeneralization occurs when AI agents learn the wrong reward function, instead of the human's intended goal. 😇 We show that training with a minimax regret objective provably mitigates it, promoting safer and better-aligned RL policies!

Matthew Farrugia-Roberts @MatthewFdashR ·
Accordingly, last year, I was invited to give a guest lecture on ethical questions raised by potential future advancements in AI for the final week of @UniMelb's COMP90087 The Ethics of Artificial Intelligence. youtu.be/-DvCQAiX2QA
Matthew Farrugia-Roberts @MatthewFdashR ·
There are many important social and ethical issues raised by today’s AI technologies. It's also true that as we project developments in AI technology into the future, we can foresee new and different ethical issues that might arise.