Dan Beechey

Explaining Decision Making in Reinforcement Learning

Project Summary

My project centres on increasing the transparency of reinforcement learning agents. I work mainly on two classes of methods:

  • Improving the intrinsic interpretability of an agent’s decision making by learning skill hierarchies: a long, uninterpretable chain of actions can be structured hierarchically so that decision making can be explained at a lower temporal resolution, reducing the cognitive load on the system’s user.

  • Producing post-hoc local and global explanations of an agent’s decisions by developing state feature attribution methods: an agent’s decision making can be understood by considering how each feature of its observation has influenced its decision. Because these techniques are post hoc, they make no assumptions about the agent and therefore do not affect its learning or performance.
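To make the second idea concrete, below is a minimal sketch of one simple post-hoc attribution scheme: occluding each state feature in turn and measuring how much the value of the agent's chosen action changes. The linear Q-function, the baseline value of zero, and the function names here are illustrative assumptions for the sketch, not a description of any specific method from the project.

```python
import numpy as np

def occlusion_attribution(q_function, state, baseline=0.0):
    """Attribute the agent's greedy action value to each state feature.

    Replaces one feature at a time with a baseline value (an assumed
    'feature absent' stand-in) and records the drop in the chosen
    action's value. Requires only query access to the Q-function,
    so it is post hoc and agnostic to how the agent was trained.
    """
    q_values = q_function(state)
    action = int(np.argmax(q_values))      # the agent's greedy choice
    original_value = q_values[action]

    attributions = np.zeros_like(state, dtype=float)
    for i in range(len(state)):
        perturbed = state.copy()
        perturbed[i] = baseline            # occlude feature i
        attributions[i] = original_value - q_function(perturbed)[action]
    return action, attributions

# Toy linear Q-function with two actions, weights chosen for illustration.
weights = np.array([[1.0, 0.0, 2.0],
                    [0.0, 1.0, 0.0]])
q = lambda s: weights @ s

state = np.array([1.0, 1.0, 1.0])
action, attr = occlusion_attribution(q, state)
# Feature 2 carries the largest weight for action 0, so it receives
# the largest attribution.
```

The attributions here are local (they explain one decision at one state); aggregating them over many visited states would give a rough global picture of which features drive the agent's behaviour.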

Research Interests

I have a general interest in all things machine learning, with a particular focus on reinforcement learning.

Within reinforcement learning, I have a particular interest in hierarchical reinforcement learning, and a more general interest in intrinsic motivation.

I am also interested in explainable artificial intelligence, with some attention to supervised learning but most emphasis on explainable reinforcement learning, covering both computational methods and social interpretations.

Background

BSc Mathematics, University of Bath.

MSc Data Science, University of Bath.

Supervisors

Prof Özgür Şimşek

Prof Emma Carmel