~~NOTOC~~
====== Seminar ======
**Hierarchical Reinforcement Learning**\\
//Dr. Joschka Boedecker//

  * **Seminar:**
    * HISinOne Number: 11LE13S-7308

  * **Time/Venue:**
    * This will be a block seminar held shortly after lectures have ended. A first meeting for planning purposes will take place on **Monday, June 13th, 4:00 PM – 5:00 PM** in room SR 00 019 (kitchen/seminar room on the upper floor) of building 079.

  * **Language:**
    * English

===== Overview: =====
We'll read and discuss a number of papers from the area of hierarchical reinforcement learning. These will include papers on early approaches and recent developments, as well as papers that provide context and background information.

===== Papers: =====
| [[http://www.cs.berkeley.edu/~russell/classes/cs294/f05/papers/parr+russell-1998.pdf|1. Reinforcement Learning with Hierarchies of Machines]] | 
| [[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.9.313&rep=rep1&type=pdf|2. The MAXQ Method for Hierarchical Reinforcement Learning]] | 
| [[https://webdocs.cs.ualberta.ca/~sutton/papers/SPS-aij.pdf|3. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning]] | 
| [[http://people.eecs.berkeley.edu/~russell/classes/cs294/f05/papers/hengst-2002.pdf|4. Discovering Hierarchy in Reinforcement Learning with HEXQ]] | 
| [[http://neuro.bstu.by/ai/Win-old/Conf-4/ICNNAI-2010/Submission-2/Ref-double/stolle2002learning.pdf|5. Learning Options in Reinforcement Learning]] | 
| [[http://www.marcpickett.com/papers/pickettICML2002.pdf|6. PolicyBlocks: An Algorithm for Creating Useful Macro-Actions in Reinforcement Learning]] | 
| [[ftp://ftp.idsia.ch/pub/juergen/bakker_HRL_IAS2004.pdf|7. Hierarchical Reinforcement Learning Based on Subgoal Discovery and Subpolicy Specialization]] | 
| [[http://www.sciencedirect.com/science/article/pii/S0921889001001130|8. Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning]] | 
| [[http://th.physik.uni-frankfurt.de/~rothkopf/docs/ToutounjiRothkopfTriesch2011.pdf|9. Scalable Reinforcement Learning through Hierarchical Decompositions for Weakly-Coupled Problems]] | 
| [[https://is.tuebingen.mpg.de/fileadmin/user_upload/files/publications/2012/AISTATS-2012-Daniel.pdf|10. Hierarchical Relative Entropy Policy Search]] | 
| [[http://pierrelucbacon.com/optioncritic-13oct2015.pdf|11. The Option-Critic Architecture]] | 
| [[http://arxiv.org/pdf/1604.06057v1.pdf|12. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation]] | 

===== Format: =====
Each paper will get a 45 min presentation slot (30 min talk + 15 min for Q&A). In addition, each student will prepare a 3-page summary of the paper they presented.