
Bryan Chan
@chanpyb
50 posts
PhD student @rlai_lab. Prev: @GoogleDeepMind, @OcadoTechnology, @kindredai, @UofTCompSci

Excited to share that our work on understanding when ICL emerges has been accepted to #ICLR2025! Submission available for preview: openreview.net/forum?id=aKJr5…


LLMs can either leverage context information, i.e., in-context learning (ICL), or memorize solutions in their weights, i.e., in-weight learning (IWL), to make predictions. But when does each emerge? 1/N
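A toy sketch of the distinction (my own illustration, not the paper's setup): an IWL-style prediction answers from knowledge baked into the model, while an ICL-style prediction follows exemplars supplied in the prompt, even when they contradict that stored knowledge.

```python
# Toy contrast between in-weight and in-context prediction.
# The "model" here is a hypothetical lookup; real LLMs blend both modes.

def predict_iwl(query, memorized):
    """In-weight learning: answer from knowledge stored in the weights
    (here, a memorized lookup), ignoring the prompt context entirely."""
    return memorized.get(query, "unknown")

def predict_icl(query, context):
    """In-context learning: answer inferred from (input, label) exemplars
    given in the prompt, overriding whatever was memorized."""
    mapping = dict(context)
    return mapping.get(query, "unknown")

memorized = {"apple": "fruit"}      # knowledge "in the weights"
context = [("apple", "company")]    # prompt relabels the same input

print(predict_iwl("apple", memorized))  # -> fruit   (ignores context)
print(predict_icl("apple", context))    # -> company (follows context)
```

The interesting question the thread asks is when training pushes a transformer toward one behavior or the other.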

Here's a fascinating paper by @domo_mr_roboto's group linking hierarchical reinforcement learning with cheaply obtainable auxiliary tasks: arxiv.org/abs/2407.03311. Better exploration with minimal engineering effort remains a critical challenge (even for RLHF/AIF). It's reminiscent of our efforts on SAC-X and on intrinsic rewards through representation learning (VAE, Transporter, etc.): arxiv.org/abs/2011.01758. Excited to see more progress in this space! #robotics #reinforcementlearning

Hey, @iclr_conf. Standard policy for experiments would be to ask participants for consent and to explain the setup thoroughly (e.g., exactly which systems are being used). I don't think we should be legitimizing the use of LLMs in the review process. blog.iclr.cc/2024/10/09/icl…
