
Lior Shani
59 posts

@LiorShan
Currently at Google Research. PhD in Reinforcement Learning from the Technion. Main research interests include Reinforcement Learning and Large Language Models.

Overall, MDPO is an easily scalable policy optimization algorithm with minimal hyper-params/heuristics involved, and is nicely grounded in mirror descent theory :) Joint work with @LiorShan, Yonathan Efroni, and Mohammad Ghavamzadeh. Come chat on Dec 11, 11:30 am PST!



Our next talk: 06/09: Shie Mannor (Technion) "Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs" For details, please see the website: sites.google.com/view/rltheorys…

our seminars are back this week with a real black-belt RL theorist!!
