James Proudfoot

Understanding Transfer Learning through the Calculation of Chemical Reaction Barriers

Project summary

This project aims to apply transfer learning to the prediction of chemical activation energies. Transfer learning (TL) is a method in machine learning (ML) that uses knowledge gained from training one model to guide and enhance the training of another. Activation energy (Ea) is the minimum energy required for a reaction to occur, and is related to the energy of the transition state (TS) that connects reactants to products. Locating transition states with traditional computational methods can be challenging and slow; researchers have therefore been exploring the use of ML to predict activation energies directly from chemical descriptors of reactions.
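The link between the activation energy and the speed of a reaction can be made concrete with the Arrhenius equation, where k is the rate constant, A the pre-exponential factor, R the gas constant and T the temperature:

```latex
k = A \, e^{-E_a / RT}
```

Because the rate depends exponentially on Ea, even modest errors in a predicted activation energy translate into large errors in predicted reaction rates, which is one reason accurate barrier prediction matters.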

I think that chemistry is a good test bed for TL because so much is known about the traditional computational methods in use, such as density functional theory (DFT). We aim to use transfer learning to improve models trained on large datasets computed at lower levels of theory by introducing smaller datasets computed at higher levels of theory, together with empirical measurements of activation energies. Additionally, it is hoped that TL can be used to apply a model trained on one type of chemical reaction to the prediction of barriers for a different but related reaction.
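As an illustrative sketch of the kind of transfer described above, the following toy example pretrains a linear model on abundant, systematically biased "low level of theory" labels, then refits on a scarce "high level" set while regularising toward the pretrained weights. Everything here is an assumption for illustration: the synthetic data, the dataset sizes, and the ridge-toward-prior formulation are stand-ins, not the project's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for reaction descriptors -> activation energies.
n_features = 5
true_w = rng.normal(size=n_features)

# Large "low level of theory" dataset: cheap labels with a constant bias.
X_low = rng.normal(size=(500, n_features))
y_low = X_low @ true_w + 2.0

# Small "high level of theory" dataset: accurate but scarce and noisy.
X_high = rng.normal(size=(20, n_features))
y_high = X_high @ true_w + 0.3 * rng.normal(size=20)

def fit_ridge(X, y, w0=None, reg=0.0):
    """Least squares with a penalty reg * ||w - w0||^2 pulling toward w0."""
    if w0 is None:
        w0 = np.zeros(X.shape[1] + 1)
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    A = Xb.T @ Xb + reg * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y + reg * w0)

# 1. Pretrain on the abundant low-level data.
w_low = fit_ridge(X_low, y_low)

# 2. Transfer: refit on the scarce high-level data, pulled toward w_low.
w_tl = fit_ridge(X_high, y_high, w0=w_low, reg=1.0)

# 3. Baseline: fit the small high-level dataset from scratch.
w_scratch = fit_ridge(X_high, y_high)

# Compare on held-out data labelled at the "high" level.
X_test = rng.normal(size=(200, n_features))
y_test = X_test @ true_w

def rmse(w):
    Xb = np.hstack([X_test, np.ones((len(X_test), 1))])
    return float(np.sqrt(np.mean((Xb @ w - y_test) ** 2)))

print(f"transfer RMSE: {rmse(w_tl):.3f}, scratch RMSE: {rmse(w_scratch):.3f}")
```

In sketches like this one, the transferred model typically outperforms the from-scratch fit once the high-level set is small relative to the number of parameters, because the pretrained weights act as an informative prior; the real project would use richer models (neural networks or Gaussian processes) and DFT-derived descriptors in place of the random features here.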

This research aligns with the aims of ART-AI (Accountable, Responsible and Transparent Artificial Intelligence). Primarily, this work will seek to increase the transparency of TL models by improving their explainability (providing mathematical or graphical explanations of the underlying mechanism of transfer) and interpretability (supplying human-readable explanations of transfer learning in the contexts in which it is used). Additionally, it is hoped that we can make the use of TL more accountable by understanding the situations in which it should, and should not, be used. Transfer learning promises to accelerate machine learning while using less data and fewer computational resources; therefore, a greater understanding of its underlying machinery may contribute to more responsible use of those resources.

Research Interests

Chemistry. Machine learning. Transfer learning. Neural networks. Gaussian Process Regression.

Background

BA/MSc Natural Sciences, University of Cambridge, UK.

Supervisors

Dr Matthew Grayson

Dr Pranav Singh

 
