
Q: How can pretrained vision-language models be used as zero-shot reward models in reinforcement learning?
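A: Pretrained vision-language models (VLMs) such as CLIP can serve as zero-shot reward models when the task is specified in natural language. The reward for a state is the similarity between the VLM's embedding of the agent's visual observation and its embedding of a text prompt describing the task, so no hand-written reward function or task-specific fine-tuning is needed. arxiv.org/abs/2310.12921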
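
As a rough illustration of the idea, the sketch below scores an observation by the cosine similarity between two embedding vectors. It is a minimal sketch, not the paper's implementation: the function name `cosine_similarity_reward` is hypothetical, and the short hand-written vectors stand in for the outputs of a real VLM's image and text encoders.

```python
import math


def cosine_similarity_reward(image_embedding, text_embedding):
    """Return the cosine similarity between an observation embedding
    and a task-prompt embedding, used here as a scalar reward."""
    dot = sum(a * b for a, b in zip(image_embedding, text_embedding))
    norm_image = math.sqrt(sum(a * a for a in image_embedding))
    norm_text = math.sqrt(sum(b * b for b in text_embedding))
    if norm_image == 0.0 or norm_text == 0.0:
        return 0.0  # undefined similarity; treat as zero reward
    return dot / (norm_image * norm_text)


# Assumption: these toy vectors stand in for real encoder outputs.
# In practice, the prompt embedding would come from the VLM's text encoder
# and each observation embedding from its image encoder.
task_prompt_embedding = [1.0, 0.0, 0.0]
observation_near_goal = [0.9, 0.1, 0.0]
observation_off_task = [0.0, 1.0, 0.0]

reward_near_goal = cosine_similarity_reward(observation_near_goal, task_prompt_embedding)
reward_off_task = cosine_similarity_reward(observation_off_task, task_prompt_embedding)
```

An observation whose embedding points in nearly the same direction as the prompt embedding receives a higher reward than one that is orthogonal to it. In an RL loop this scalar would replace the environment's reward at each step.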