Pinned Tweet

new preprint
"ReLU to the Rescue: Improve your On-policy Actor-Critic with Positive Advantages"
shockingly simple changes to A3C can yield a cautious RL algorithm that is more effective than PPO
in some settings, just adding a ReLU is enough!
arxiv.org/abs/2306.01460
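
to make "just adding a ReLU" concrete, here is a rough sketch of one way to read it: pass the advantage estimates through a ReLU before they weight the log-probabilities in the policy loss, so only actions with positive advantage contribute. this is an assumption based on the preprint's title, not the paper's actual algorithm; `relu`, `policy_loss`, `clip_positive` and the toy numbers are all hypothetical names and values chosen for illustration.

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit: max(x, 0)."""
    return x if x > 0.0 else 0.0


def policy_loss(log_probs, advantages, clip_positive=True):
    """Mean policy-gradient surrogate loss: -mean(w_i * log_prob_i).

    With clip_positive=True the weights are ReLU(advantage), so samples
    with negative advantage are dropped from the loss (assumed reading
    of "positive advantages", not taken from the paper's code).
    """
    weights = [relu(a) if clip_positive else a for a in advantages]
    return -sum(w * lp for w, lp in zip(weights, log_probs)) / len(log_probs)


# Toy batch: log-probabilities of the taken actions and their advantages.
log_probs = [math.log(0.5), math.log(0.25), math.log(0.8)]
advantages = [1.0, -2.0, 0.5]

standard = policy_loss(log_probs, advantages, clip_positive=False)
positive_only = policy_loss(log_probs, advantages, clip_positive=True)

print(f"standard loss:      {standard:.4f}")
print(f"positive-only loss: {positive_only:.4f}")
```

in this toy batch the second sample has a negative advantage, so it gets zero weight in the positive-only loss and the update never pushes that action's probability down; it only raises the probability of actions that did better than expected.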
