Lucas Alegre
@lnalegre

207 posts

Professor at @INF_UFRGS. Interested in multi-task and multi-objective reinforcement learning.

Porto Alegre, Brasil · Joined September 2019
453 Following · 175 Followers
Pinned Tweet
Lucas Alegre @lnalegre
I am happy to announce that I successfully defended my PhD, entitled “Sample-Efficient Multi-Task and Multi-Objective Reinforcement Learning by Combining Multiple Behaviors”! 🎉 I am very lucky to have collaborated with and met so many great people during this PhD journey 😄
Lucas Alegre retweeted
Bo Wang @BoWang87
🚨 BREAKING: Brazil just gave away a breakthrough invention because it couldn't pay a patent fee.

Dr. Tatiana Sampaio drops a bombshell: The "Polilaminina" — a revolutionary material developed at Brazil's top federal university — is now anyone's to claim.

Why? Budget cuts in 2015-2016 left UFRJ too broke to renew the international patent.

Translation: Years of research. Millions in public investment. Gone. Because the government couldn't spare the maintenance fees.

Now foreign companies can patent it themselves — and Brazil will pay them to use what it invented.

This is what "austerity" actually costs.
Bo Wang@BoWang87

A Brazilian scientist worked in silence for 25 years on something medicine said was impossible: regenerating the spinal cord.

Dr. Tatiana Sampaio extracted a protein from placentas that acts as "biological glue" — recreating the conditions that let embryonic neurons connect.

Six patients with complete spinal cord injuries regained movement. Bruno Drummond was tetraplegic after a car accident. Two weeks after treatment, he moved his toe. Today he walks, climbs stairs, dances.

Her quote when asked why she finally went public: "I no longer have the right to be conservative."

25 years. No social media. No self-promotion. Just the work. This is what real science looks like.

Lucas Alegre retweeted
Khurram Javed @kjaved_
The Dwarkesh/Andrej interview is worth watching. Like many others in the field, my introduction to deep learning was Andrej’s CS231n. In this era when many are involved in wishful thinking driven by simple pattern matching (e.g., extrapolating scaling laws without nuance), it’s refreshing to hear an influential voice that is tethered to reality.

One clarification for the podcast is that when Andrej says humans don’t use reinforcement learning, he is really saying humans don't use returns as learning targets. His example of LLMs struggling to learn to solve math problems from outcome-based rewards also elucidates the problem with learning directly from returns.

Fortunately for RL, this exact problem is solved by temporal difference (TD) learning. All sample-efficient RL algorithms that show human-like learning (e.g., sample-efficient learning on Atari, and our work on learning from experience directly on a robot) rely on TD learning.

Now Andrej is not primarily an RL person; he is looking at RL through the lens of LLMs these days, and all RL done in LLMs uses returns as targets, so it’s understandable that he is assuming that RL is all about learning from observed returns. But this assumption leads him to the incorrect conclusion that we need process-based dense rewards for RL to work.

If you embrace TD learning, then you don't necessarily need a dense reward. Once you have learned a value function that encodes useful knowledge about the world, you can learn on the fly in the absence of rewards, just like humans and animals. This is possible because in TD learning there is no difference between learning from an unexpected reward and learning from an unexpected change in perceived value.
Dwarkesh Patel@dwarkesh_sp

The @karpathy interview
0:00:00 – AGI is still a decade away
0:30:33 – LLM cognitive deficits
0:40:53 – RL is terrible
0:50:26 – How do humans learn?
1:07:13 – AGI will blend into 2% GDP growth
1:18:24 – ASI
1:33:38 – Evolution of intelligence & culture
1:43:43 – Why self driving took so long
1:57:08 – Future of education

Look up Dwarkesh Podcast on YouTube, Apple Podcasts, Spotify, etc. Enjoy!

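The TD point above can be made concrete with a minimal tabular sketch. The chain MDP, constants, and names below are invented for illustration (they are not from any of the works mentioned); the thing to notice is that one and the same TD(0) update handles an unexpected reward and an unexpected change in predicted value.

```python
import numpy as np

# 5-state chain: start in state 0, always move right; reward +1 only
# on entering the terminal state 4. All values are illustrative.
n_states, gamma, alpha = 5, 0.9, 0.1
V = np.zeros(n_states)  # value estimates; terminal value stays 0

for _ in range(500):  # episodes
    s = 0
    while s < n_states - 1:
        s_next = s + 1
        r = 1.0 if s_next == n_states - 1 else 0.0
        # TD error: learning from an unexpected reward (r) and from an
        # unexpected change in predicted value (gamma * V[s_next]) are
        # the same update -- no return (sum of future rewards) needed.
        td_error = r + gamma * V[s_next] - V[s]
        V[s] += alpha * td_error
        s = s_next
```

After training, `V` approaches `[0.9**3, 0.9**2, 0.9, 1.0, 0.0]`: each state bootstraps from its successor's value, so no state ever needs to observe a full return.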
Lucas Alegre retweeted
ICLR 2026 @iclr_conf
ICLR 2026 will take place in 📍 Rio de Janeiro, Brazil, 📅 April 23–27, 2026. Save the date - see you in Rio! #ICLR2026
Lucas Alegre @lnalegre
Finally, reporting only IQM may compromise scientific transparency and fairness, as it can mask poor or unstable performance. @agarwl_ et al., who introduced IQM in this context, recommend using it in conjunction with other statistics rather than as a standalone measure.
Lucas Alegre @lnalegre
Yes, Interquartile Mean (IQM) is a robust statistic that reduces the influence of outliers. But it does not by itself provide a fair analysis of performance. IQM does not capture the full distribution of returns and may hide important information about variability and risk.
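A minimal sketch of this point, with invented per-seed returns: two algorithms can have an identical IQM (the mean of the middle 50% of runs) while one is far more variable, so IQM alone hides spread and risk.

```python
import numpy as np

def iqm(scores):
    # Interquartile mean: sort, drop the bottom and top 25% of runs,
    # and average the middle 50%.
    s = np.sort(np.asarray(scores))
    cut = len(s) // 4
    return s[cut:len(s) - cut].mean()

# Hypothetical per-seed returns for two algorithms (10 seeds each).
# A is stable; B has catastrophic and lucky outlier runs.
algo_a = np.array([0.48, 0.50, 0.51, 0.52, 0.49, 0.50, 0.51, 0.50, 0.49, 0.50])
algo_b = np.array([0.10, 0.15, 0.50, 0.51, 0.49, 0.52, 0.50, 0.48, 0.90, 0.85])
```

Both arrays have an IQM of 0.50, yet `algo_b`'s standard deviation is roughly twenty times larger; reporting IQM together with dispersion statistics (or the full score distribution) avoids this blind spot.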
Lucas Alegre @lnalegre
While I really like the paper "Deep Reinforcement Learning at the Edge of the Statistical Precipice", I have seen papers evaluating performance using only the IQM metric and claiming, based on this paper, that it is a fairer metric than the mean. That claim is simply wrong.
Lucas Alegre @lnalegre
Annoyed by having to retrain your entire policy just because your reward weights did not quite work on the real robot? 🤖
Lucas Alegre retweeted
Association for Computing Machinery
Meet the recipients of the 2024 ACM A.M. Turing Award, Andrew G. Barto and Richard S. Sutton! They are recognized for developing the conceptual and algorithmic foundations of reinforcement learning. Please join us in congratulating the two recipients! bit.ly/4hpdsbD
Lucas Alegre @lnalegre
Finally, I would like to thank my advisors, Prof. Ana Bazzan and Prof. Bruno C. da Silva, and Prof. Ann Nowé, who received me at VUB for my PhD stay. I am very grateful to everyone with whom I had the chance to collaborate on such amazing projects! 💙
Lucas Alegre @lnalegre
I believe all these contributions open up room for many interesting ideas in multi-policy RL methods, especially in transfer learning (SFs&GPI) and multi-objective RL settings! 🚀
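The SFs&GPI idea mentioned above can be sketched in a few lines. The successor features below are random placeholders rather than learned quantities, and all names are illustrative: the point is that if each policy's value is linear in reward features, Q_i(s, a) = ψ_i(s, a) · w, then a new task defined by new reward weights w only requires re-evaluating the stored ψ's, not retraining.

```python
import numpy as np

# Illustrative setup: 3 previously learned policies, 4 states, 2 actions,
# 3-dimensional reward features phi. psi[i, s, a] is the expected
# discounted sum of phi under policy i (here: random placeholders).
rng = np.random.default_rng(0)
n_policies, n_states, n_actions, d = 3, 4, 2, 3
psi = rng.random((n_policies, n_states, n_actions, d))

def gpi_action(s, w):
    # Generalized Policy Improvement: evaluate every stored policy's
    # successor features under the new weights w, then act greedily
    # over the best of all of them.
    q = psi[:, s] @ w              # shape (n_policies, n_actions)
    return int(np.argmax(q.max(axis=0)))

w_new = np.array([1.0, -0.5, 0.2])  # a new task is just new weights
a = gpi_action(0, w_new)            # no retraining needed
```

The resulting behavior is guaranteed to be no worse than any single stored policy on the new task, which is the appeal for robots where reward weights are tuned after deployment.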