Matt Hewitt

Hierarchical Reinforcement Learning for Transparency in AI

Project Summary

To the average human, picking up an item is a trivial action. However, this simple function consists of numerous complex sub-actions that can be represented as a hierarchy of smaller tasks. The brain abstracts these sub-actions, so that a human only considers the high-level skill while the lower-level actions are performed subconsciously. Currently, most reinforcement learning (RL) algorithms focus on mapping a state observation directly to low-level actions. This makes it hard to see how conventional RL agents operate when they are applied to large-scale problems. Such agents also struggle with temporal abstraction and generalise poorly: they tend to work only in the environments in which they were trained and can fail after even the smallest modifications. The human brain has developed ways to overcome all of these problems, and advancements in RL should aim to address them too. Hierarchical reinforcement learning (HRL) offers a solution by composing progressively smaller sub-skills into high-level behaviours that are temporally abstract. This increases the transparency of the resulting agents, because the actions they take can be represented as a set of high-level skills that humans can easily understand. In addition, having algorithms create these hierarchies autonomously should result in the development of domain-independent skills. As such, my research will focus on new methods of autonomous skill discovery in hierarchical reinforcement learning.

Research Interests

Reinforcement Learning with an emphasis on hierarchical methods.

Deep Learning.

Robotics and Automation.


BSc Computer Science, University of Bath

MSc Data Science, University of Bath


Prof Özgür Şimşek

Dr Uriel Martinez-Hernandez