
Shen Li
@ShenLiRobot
Postdoc @MIT & @MIT_CSAIL | Human-Robot/AI Interaction | Learning from Human Feedback


Excited to present our #NeurIPS2024 Oral talk! 🚀
Enhancing Preference-based Linear Bandits via Human Response Time

Coffee or tea? If you choose instantly, you likely have a strong preference. How can AI leverage this psychological insight to better learn human preferences?

Curious? Don't think too long! Let's connect and explore how psychology drives smarter AI.

📅 Dec. 11, 3:30-3:50 PM PST
📍 Oral Session 2A: Agents (East Ballroom A, B)
👉 Conference Session: neurips.cc/virtual/2024/p…
👉 Paper on arXiv: arxiv.org/pdf/2409.05798


Excited to share our new work: Enhancing Preference-based Linear Bandits via Human Response Time ⏱️🤖
With @edgeyyzhang, Zhaolin Ren, Prof. Na Li, @ClaireYLiang, and Prof. @julie_a_shah
👉 arxiv.org/abs/2409.05798

We show that human response times carry information about how strong a preference is, and that using them speeds up preference learning. This complements existing bandit algorithms, which learn only from binary choices. We demonstrate this by integrating a psychology model (the EZ-Diffusion Model) into a bandit algorithm.

#AI #MachineLearning #RLHF #HumanFeedback #psychology #Bandits #Robotics #EZDiffusionModel
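For readers who want the intuition in code, here is a minimal sketch of why response time adds information beyond the binary choice. It is not the paper's algorithm. It assumes a textbook drift-diffusion model with symmetric barriers at ±`barrier`, unit noise, an unbiased starting point, and no non-decision time. The function names and the example values are hypothetical and chosen only for illustration.

```python
import math


def ddm_expected_choice(drift: float, barrier: float) -> float:
    """Expected choice E[c], with c in {-1, +1}, under the assumed model."""
    return math.tanh(barrier * drift)


def ddm_expected_decision_time(drift: float, barrier: float) -> float:
    """Expected decision time E[t] under the assumed model."""
    if drift == 0.0:
        # Limit of (barrier / drift) * tanh(barrier * drift) as drift -> 0.
        return barrier ** 2
    return (barrier / drift) * math.tanh(barrier * drift)


def preference_strength(drift: float, barrier: float) -> float:
    """E[c] / E[t], which equals drift / barrier under the assumed model."""
    expected_choice = ddm_expected_choice(drift, barrier)
    expected_time = ddm_expected_decision_time(drift, barrier)
    return expected_choice / expected_time


if __name__ == "__main__":
    barrier = 1.0  # hypothetical barrier height
    for drift in (3.0, 4.0):  # two hypothetical strong preferences
        print(
            f"drift={drift}: "
            f"E[choice]={ddm_expected_choice(drift, barrier):.4f}, "
            f"E[time]={ddm_expected_decision_time(drift, barrier):.4f}, "
            f"ratio={preference_strength(drift, barrier):.4f}"
        )
```

In this sketch, the expected choice is nearly saturated for both strong preferences, so binary choices alone barely distinguish them. The expected decision time still differs, and the ratio of the two recovers the preference strength up to the barrier scale.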

