
Most robotics RL papers are just imitation learning in disguise. The "human expert" transfers the task through extensive reward shaping, curricula, initialization strategies, environment design, and various other tricks. You are providing demonstrations, just indirectly. A reward function is just a demonstration written in a different language.
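
To make the last point concrete, here is a minimal toy sketch, not taken from any particular paper. It assumes a hypothetical 2D grid reaching task in which the designer shapes the reward around waypoints they already have in mind. The names `WAYPOINTS`, `shaped_reward`, and `greedy_rollout` are invented for illustration. An agent that simply follows the shaped reward ends up replaying the designer's intended path.

```python
# Hypothetical waypoints the reward designer has in mind (illustrative only).
WAYPOINTS = [(0, 0), (0, 3), (4, 3)]


def shaped_reward(pos, next_idx):
    """Negative Manhattan distance to the next waypoint the designer chose."""
    wx, wy = WAYPOINTS[next_idx]
    return -(abs(pos[0] - wx) + abs(pos[1] - wy))


def greedy_rollout(start, max_steps=50):
    """Follow the shaped reward greedily, one grid step at a time."""
    pos, next_idx, path = start, 1, [start]
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for _ in range(max_steps):
        if next_idx == len(WAYPOINTS):
            break  # every waypoint has been reached
        candidates = [(pos[0] + dx, pos[1] + dy) for dx, dy in moves]
        # Pick the neighbouring cell with the highest shaped reward.
        pos = max(candidates, key=lambda p: shaped_reward(p, next_idx))
        path.append(pos)
        if pos == WAYPOINTS[next_idx]:
            next_idx += 1
    return path


path = greedy_rollout(WAYPOINTS[0])
print(path)
```

The rollout passes through every waypoint in order, so in this toy setting the "learned" behavior is the designer's trajectory, only encoded as a reward instead of as a recorded demonstration.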














