Lior Shani

59 posts

@LiorShan

Currently at Google Research. PhD in Reinforcement Learning from the Technion. Main research interests include Reinforcement Learning and Large Language Models

Joined November 2019
128 Following · 140 Followers
Lior Shani retweeted
Ofir Nabati (@ofirnabati)
We're excited to share our new paper: "Personalized and Sequential Text-to-Image Generation"! Check out the paper and our new sequential human rater dataset! 👇 Paper: arxiv.org/pdf/2412.10419 Dataset: kaggle.com/datasets/googl… Details below.. 1/N 🧵
Lior Shani (@LiorShan)
Experimental results showing MTPO outperforms single-turn RLHF baselines and a multi-turn generalization of RLHF. This demonstrates the effectiveness of our approach in improving the quality of multi-turn conversations. (7/8)
Lior Shani (@LiorShan)
Finally, we discuss the need for exploration in imitation learning, and argue that our Apprenticeship Learning-based approach, which relies on the MDP structure, is superior to supervised learning approaches such as behavioral cloning (BC).
Lior Shani (@LiorShan)
We show our approach is both theoretically efficient and practical: we provide regret guarantees and show how to avoid solving an MDP at each iteration as in prior works. This allows us to devise a well-performing deep RL implementation of our (OAL) algorithm.
Lior Shani (@LiorShan)
Glad to present our paper "Online Apprenticeship Learning" at #AAAI2022: arxiv.org/abs/2102.06924, with @TZahavy and @MannorShie. We show how to efficiently reproduce experts' behavior from an offline dataset of trajectories, by interacting with the MDP (when rewards are not specified).
Lior Shani retweeted
Guy Tennenholtz (@guytenn)
teleporting, swimming with sharks, and lots more. It turned out better than I ever expected! Now, before releasing it to the world, I'm looking for a partner who can help me market the game properly. You can help me out by retweeting! Promo:
Lior Shani retweeted
Ludwig Cancer (@Ludwig_Cancer)
Ludwig @Princeton Director Joshua Rabinowitz, a pioneer of metabolomics, has contributed to the development of a cancer therapy & undone enduring assumptions about metabolism. His work is opening new approaches to cancer therapy. Learn more: bit.ly/3Ex0ndx
Manan Tomar (@manan_tomar)
Overall, MDPO is an easily scalable policy optimization algorithm with minimal hyperparameters and heuristics involved, and is nicely grounded in mirror descent theory :) Joint work with @LiorShan, Yonathan Efroni, and Mohammad Ghavamzadeh. Come chat on Dec 11, 11:30 am PST!
Lior Shani (@LiorShan)
Prof. Shie Mannor is presenting our work at the great RL theory seminar this Tuesday! The talk will be about the connections between TRPO and convex optimization, possible practical implications and how to explore in policy optimization...
RL Theory Virtual Seminars (@RLtheory)

Our next talk, 06/09: Shie Mannor (Technion), "Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs". For details, please see the website: sites.google.com/view/rltheorys…
