
Humans augment learning from experience and demonstrations with learning from general instructions, rules, and descriptions of behaviour. PBB enables LLMs to do the same, unlocking a highly sample-efficient form of learning. Excited to share this work in Brazil! 🇧🇷










