Topics & Resources
Proseminar
The proseminar topics are largely based on the RL book by Sutton & Barto (Reinforcement Learning: An Introduction).
1. Multi-armed bandits & Markov Decision Processes (2.1, 3)
2. Policy Iteration (4.1, 4.2, 4.3)
3. Value Iteration (4.4)
4. Monte Carlo Methods (5.1, 5.2, 5.3) and Temporal Difference Methods (6.1, 6.2)
5. Q-learning and Sarsa (6.4, 6.5; see the code sketch below)
The parenthesized numbers refer to specific sections of the book, which should serve as a good starting point. You are more than welcome to use additional resources and materials when preparing your talk.
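To give a flavor of topic 5, here is a minimal sketch of tabular one-step Q-learning (Section 6.5). The environment interface (`env.reset()`, `env.step(action)` returning `(next_state, reward, done)`) is an assumed Gym-style convention, not something specified by the course.

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular one-step Q-learning (Sutton & Barto, Section 6.5)."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy behavior policy
            if np.random.rand() < epsilon:
                a = np.random.randint(n_actions)
            else:
                a = int(np.argmax(Q[s]))
            s_next, r, done = env.step(a)  # assumed Gym-style interface
            # off-policy TD target: bootstrap from the greedy action value
            target = r + (0.0 if done else gamma * np.max(Q[s_next]))
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
    return Q
```

Replacing the `np.max(Q[s_next])` bootstrap with the value of the action actually taken next turns this into the on-policy Sarsa update from Section 6.4.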
Seminar
The seminar topics aim to cover some of the most fundamental model-free and model-based deep reinforcement learning algorithms, and to give insight into one additional key topic in deep RL: methods for improving exploration. These topics are based mostly on the respective publications and, where noted, on specific sections of the RL book by Sutton & Barto.
Transitioning to deep RL
6. Function Approximation (9.1, 9.2, 9.3)
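As a bridge from the tabular methods above to the deep methods below, here is a minimal sketch of semi-gradient TD(0) prediction with a linear value function (Sections 9.1 to 9.3). The feature map `phi` and the transition iterable are hypothetical placeholders, not part of the course material.

```python
import numpy as np

def semi_gradient_td0(transitions, phi, n_features, alpha=0.01, gamma=0.99):
    """Semi-gradient TD(0) for value prediction with a linear
    approximator v(s, w) = w . phi(s) (Sutton & Barto, Ch. 9).

    `transitions` is an iterable of (s, r, s_next, done) tuples;
    `phi` maps a state to an n_features-dimensional feature vector.
    """
    w = np.zeros(n_features)
    for s, r, s_next, done in transitions:
        v = w @ phi(s)
        v_next = 0.0 if done else w @ phi(s_next)
        # semi-gradient: the TD target is treated as a constant, so
        # only the gradient of v(s, w), i.e. phi(s), enters the update
        w += alpha * (r + gamma * v_next - v) * phi(s)
    return w
```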
Model-free DRL algorithms
7. Deep Q-Networks (DQN)
8. DQN improvements (pick two of the three): Double DQN (useful: RL book 6.7), Dueling DQN, Prioritized Experience Replay
9. Policy Gradient Methods (13.1, 13.2, 13.3) and Advantage Actor-Critic (useful, but optional: control variates)
10. Proximal Policy Optimization
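The heart of topic 10 is PPO's clipped surrogate objective. Below is a minimal PyTorch-style sketch of that loss; the tensors `log_probs`, `old_log_probs`, and `advantages` are assumed to be computed elsewhere (e.g. advantages via GAE).

```python
import torch

def ppo_clip_loss(log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate policy loss from the PPO paper.

    log_probs:     log pi_theta(a|s) under the current policy
    old_log_probs: log pi_theta_old(a|s) from the data-collecting policy
    advantages:    advantage estimates, treated as constants
    """
    ratio = torch.exp(log_probs - old_log_probs)  # pi_theta / pi_theta_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # pessimistic bound: take the elementwise minimum, negate for descent
    return -torch.min(unclipped, clipped).mean()
```

The clipping removes the incentive to move the probability ratio outside [1 - eps, 1 + eps], which is what keeps the policy update "proximal".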
Model-based DRL algorithms
11. Model-based RL and Monte Carlo Tree Search (8.1, 8.10, 8.11; see the selection-rule sketch after this list)
12. AlphaGo Zero
13. PlaNet
14. Dreamer (v2)
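Topics 11 and 12 both build on tree search. As a reference point, here is a minimal sketch of the UCB1 child-selection rule used in the selection phase of MCTS; the `Node` class is a hypothetical simplification (AlphaGo Zero uses the related PUCT rule with network priors instead).

```python
import math

class Node:
    """Minimal MCTS node: running statistics plus child links."""
    def __init__(self):
        self.visits = 0        # N(s, a)
        self.value_sum = 0.0   # sum of backed-up returns
        self.children = {}     # action -> Node

def ucb1_select(node, c=1.4):
    """Pick the child maximizing Q(s, a) + c * sqrt(ln N(s) / N(s, a))."""
    best_action, best_score = None, -float("inf")
    for action, child in node.children.items():
        if child.visits == 0:
            return action  # expand unvisited children first
        exploit = child.value_sum / child.visits
        explore = c * math.sqrt(math.log(node.visits) / child.visits)
        if exploit + explore > best_score:
            best_action, best_score = action, exploit + explore
    return best_action
```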
Exploration
15. Random Network Distillation (see the sketch after this list)
16. Hindsight Experience Replay
17. Adversarially Motivated Intrinsic Goals
18. Adversarially Guided Actor Critic
19. Go-Explore
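For topic 15, the core idea of Random Network Distillation is to use the prediction error of a trained predictor network against a fixed, randomly initialized target network as an intrinsic reward. A minimal PyTorch-style sketch, with arbitrary layer sizes chosen here for illustration:

```python
import torch
import torch.nn as nn

class RND(nn.Module):
    """Random Network Distillation: intrinsic reward is the error of a
    predictor network distilling a fixed random target network."""
    def __init__(self, obs_dim, embed_dim=64):
        super().__init__()
        self.target = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                    nn.Linear(128, embed_dim))
        self.predictor = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                       nn.Linear(128, embed_dim))
        # the target stays fixed; only the predictor is trained
        for p in self.target.parameters():
            p.requires_grad_(False)

    def intrinsic_reward(self, obs):
        with torch.no_grad():
            target_feat = self.target(obs)
        pred_feat = self.predictor(obs)
        # per-sample MSE: high on novel states, shrinks with familiarity
        return ((pred_feat - target_feat) ** 2).mean(dim=-1)
```

The same per-sample error doubles as the predictor's training loss, so the exploration bonus automatically decays on frequently visited states.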