Akshil Patel

Improving Control with Intrinsically Motivated Reinforcement Learning

Project Summary

This project will address the problem of principled exploration for artificial agents, taking inspiration from how humans are motivated. Humans, for example, pay attention to novel or surprising events in the world, and we can implement these sources of motivation in the artificial learning setting. Consider an autonomous driving agent that has not yet driven through a roundabout during training. To drive safely and effectively, the agent would benefit from practising roundabouts, and seeking novelty would encourage it to do exactly that. The resulting behaviour prioritises the agent’s exploration according to its current abilities.

Motivational factors such as novelty and surprise are general concepts that appear throughout real use cases, which highlights how widely intrinsic motivation can be applied to improve exploration. Seeking novelty, however, does not necessarily improve the agent’s ability to complete its tasks. For example, the colour of the buildings the agent passes is irrelevant to driving, so seeking out buildings with novel colours wastes training resources.

Ensuring that exploratory behaviour is helpful for the agent’s current and future tasks is a common challenge in applying intrinsic motivation to reinforcement learning. This project will address that challenge by producing algorithms that guide exploration to directly improve the agent’s ability to complete tasks over its lifetime.
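One common way to formalise novelty-seeking is a count-based bonus added to the task reward, so that rarely visited states (such as a first roundabout) attract the agent. The sketch below is purely illustrative and is not the method this project will develop; the `beta` parameter and the hashable state representation are assumptions for the example.

```python
from collections import defaultdict
import math

class CountBasedBonus:
    """Intrinsic reward that decays with the number of visits to a state,
    so novel states attract the agent and familiar ones fade in appeal."""

    def __init__(self, beta: float = 1.0):
        self.beta = beta                # scales the bonus relative to the task reward
        self.counts = defaultdict(int)  # visit counts per (hashable) state

    def __call__(self, state) -> float:
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])

# The agent would then maximise task_reward + bonus(state) rather than
# task_reward alone, biasing exploration towards unfamiliar situations.
bonus = CountBasedBonus(beta=1.0)
first = bonus("roundabout")   # 1.0 on the first visit
second = bonus("roundabout")  # smaller on the second visit: novelty fades
```

Note that this bonus rewards any novelty, including the irrelevant building colours discussed above, which is exactly the shortcoming the project aims to overcome.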

Research Interests

Reinforcement learning.

Intrinsic motivation.

Behaviour hierarchies.
Machine learning.


Education

MSc Machine Learning and Autonomous Systems – University of Bath (2018-2019).

BSc Mathematical Sciences – University of Bath (2015-2018).

Experience

Software Validation Engineer – Mango Solutions (2017-2019).

Co-Organiser – BathML meetup (2019-2020).

ART-AI ED&I board member (2019-).


Supervisors

Dr Özgür Şimşek

Dr Iulia Cioroianu