Topics & Resources
Proseminar
The proseminar topics are largely based on the RL book by Sutton & Barto (Reinforcement Learning: An Introduction).
1. Multi-armed bandits & Markov Decision Processes (2.1, 3)
2. Policy Iteration (4.1, 4.2, 4.3)
3. Value Iteration (4.4)
4. Monte Carlo Methods (5.1, 5.2, 5.3) and Temporal Difference Methods (6.1, 6.2)
5. Q-learning and Sarsa (6.4, 6.5; see the code sketch below)
The parenthesized numbers refer to specific sections of the book, which should serve as a good starting point. You are more than welcome to use additional resources and materials when preparing your talk.
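To give a flavor of topic 5, here is a minimal sketch of tabular one-step Q-learning (Section 6.5). The environment interface (`env.reset()`, `env.step(action)` returning `(next_state, reward, done)`) is an assumed Gym-style convention, not something specified by the course.

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular one-step Q-learning (Sutton & Barto, Section 6.5)."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy behavior policy
            if np.random.rand() < epsilon:
                a = np.random.randint(n_actions)
            else:
                a = int(np.argmax(Q[s]))
            s_next, r, done = env.step(a)  # assumed Gym-style interface
            # off-policy TD target: bootstrap from the greedy action value
            target = r + (0.0 if done else gamma * np.max(Q[s_next]))
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
    return Q
```

Replacing the `np.max(Q[s_next])` bootstrap with the value of the action actually taken next turns this into the on-policy Sarsa update from Section 6.4.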
Seminar
The seminar topics aim to cover some of the most fundamental model-free and model-based deep reinforcement learning algorithms, and to give insight into one additional key topic in deep RL: methods for improving exploration. These topics are based mostly on the respective publications and, where noted, on specific sections of the RL book by Sutton & Barto.
Transitioning to deep RL
6. Function Approximation (9.1, 9.2, 9.3)
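As a bridge from the tabular methods above to the deep methods below, here is a minimal sketch of semi-gradient TD(0) prediction with a linear value function (Sections 9.1 to 9.3). The feature map `phi` and the transition iterable are hypothetical placeholders, not part of the course material.

```python
import numpy as np

def semi_gradient_td0(transitions, phi, n_features, alpha=0.01, gamma=0.99):
    """Semi-gradient TD(0) for value prediction with a linear
    approximator v(s, w) = w . phi(s) (Sutton & Barto, Ch. 9).

    `transitions` is an iterable of (s, r, s_next, done) tuples;
    `phi` maps a state to an n_features-dimensional feature vector.
    """
    w = np.zeros(n_features)
    for s, r, s_next, done in transitions:
        v = w @ phi(s)
        v_next = 0.0 if done else w @ phi(s_next)
        # semi-gradient: the TD target is treated as a constant, so
        # only the gradient of v(s, w), i.e. phi(s), enters the update
        w += alpha * (r + gamma * v_next - v) * phi(s)
    return w
```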
Model-free DRL algorithms
7. Deep Q-Networks (DQN)
8. DQN improvements (pick two of the three): Double DQN (useful: RL book 6.7), Dueling DQN, Prioritized Experience Replay
9. Policy Gradient Methods (13.1, 13.2, 13.3) and Advantage Actor-Critic (useful, but optional: control variates)
10. Proximal Policy Optimization
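The heart of topic 10 is PPO's clipped surrogate objective. Below is a minimal PyTorch-style sketch of that loss; the tensors `log_probs`, `old_log_probs`, and `advantages` are assumed to be computed elsewhere (e.g. advantages via GAE).

```python
import torch

def ppo_clip_loss(log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate policy loss from the PPO paper.

    log_probs:     log pi_theta(a|s) under the current policy
    old_log_probs: log pi_theta_old(a|s) from the data-collecting policy
    advantages:    advantage estimates, treated as constants
    """
    ratio = torch.exp(log_probs - old_log_probs)  # pi_theta / pi_theta_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # pessimistic bound: take the elementwise minimum, negate for descent
    return -torch.min(unclipped, clipped).mean()
```

The clipping removes the incentive to move the probability ratio outside [1 - eps, 1 + eps], which is what keeps the policy update "proximal".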
Model-based DRL algorithms
11. Model-based RL and Monte Carlo Tree Search (8.1, 8.10, 8.11; see the selection-rule sketch after this list)
12. AlphaGo Zero
13. PlaNet
14. Dreamer (v2)
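Topics 11 and 12 both build on tree search. As a reference point, here is a minimal sketch of the UCB1 child-selection rule used in the selection phase of MCTS; the `Node` class is a hypothetical simplification (AlphaGo Zero uses the related PUCT rule with network priors instead).

```python
import math

class Node:
    """Minimal MCTS node: running statistics plus child links."""
    def __init__(self):
        self.visits = 0        # N(s, a)
        self.value_sum = 0.0   # sum of backed-up returns
        self.children = {}     # action -> Node

def ucb1_select(node, c=1.4):
    """Pick the child maximizing Q(s, a) + c * sqrt(ln N(s) / N(s, a))."""
    best_action, best_score = None, -float("inf")
    for action, child in node.children.items():
        if child.visits == 0:
            return action  # expand unvisited children first
        exploit = child.value_sum / child.visits
        explore = c * math.sqrt(math.log(node.visits) / child.visits)
        if exploit + explore > best_score:
            best_action, best_score = action, exploit + explore
    return best_action
```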
Exploration
15. Random Network Distillation (see the sketch after this list)
16. Hindsight Experience Replay
17. Adversarially Motivated Intrinsic Goals
18. Adversarially Guided Actor Critic
19. Go-Explore
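For topic 15, the core idea of Random Network Distillation is to use the prediction error of a trained predictor network against a fixed, randomly initialized target network as an intrinsic reward. A minimal PyTorch-style sketch, with arbitrary layer sizes chosen here for illustration:

```python
import torch
import torch.nn as nn

class RND(nn.Module):
    """Random Network Distillation: intrinsic reward is the error of a
    predictor network distilling a fixed random target network."""
    def __init__(self, obs_dim, embed_dim=64):
        super().__init__()
        self.target = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                    nn.Linear(128, embed_dim))
        self.predictor = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                       nn.Linear(128, embed_dim))
        # the target stays fixed; only the predictor is trained
        for p in self.target.parameters():
            p.requires_grad_(False)

    def intrinsic_reward(self, obs):
        with torch.no_grad():
            target_feat = self.target(obs)
        pred_feat = self.predictor(obs)
        # per-sample MSE: high on novel states, shrinks with familiarity
        return ((pred_feat - target_feat) ** 2).mean(dim=-1)
```

The same per-sample error doubles as the predictor's training loss, so the exploration bonus automatically decays on frequently visited states.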