Exploration in RL

Self-Adaptive Imitation Learning: Learning Tasks with Delayed Rewards from Sub-Optimal Demonstrations.

Reinforcement learning (RL) has demonstrated its superiority in solving sequential decision-making problems. However, heavy dependence on immediate reward feedback impedes the wide application of RL. On the other hand, imitation learning (IL) tackles …