Last updated:
ID:
46926
Start date:
12 March 2019
Project status:
Current
Principal investigator:
Dr Loren Buhle
Lead institution:
DNAnexus, Inc., United States of America

UK Biobank data is an invaluable resource for developing machine learning methods to improve human health. While there are already many successful applications in risk assessment and disease prediction, some of the algorithms used behave like black boxes. We plan to apply computational techniques to “open up the black boxes” by systematically perturbing the features used for prediction and measuring how the predictive power responds to the perturbations (e.g., the SHAP method https://arxiv.org/abs/1705.07874, and DeepExplain https://github.com/marcoancona/DeepExplain). This will provide insight into how a successful machine learning algorithm reaches its decisions. Revealing this will help clarify the biological causes and the relationships between different factors, and will also help to improve assessment and prediction algorithms in the future.
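
The general idea of perturbing features and measuring the change in predictive power can be illustrated with a minimal sketch. It uses simple permutation-based importance, which is neither SHAP nor DeepExplain but a simpler technique in the same spirit, and it is not the project's actual method. The data are synthetic (not UK Biobank data), and the names `perturbation_importance` and `black_box` are hypothetical, introduced only for illustration.

```python
import numpy as np

def perturbation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled.

    predict: any callable mapping a feature matrix to predicted labels.
    Returns the unperturbed accuracy and one importance value per feature.
    """
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        scores = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Perturb only feature j by shuffling its values across samples.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            scores.append(np.mean(predict(X_perm) == y))
        drops[j] = baseline - np.mean(scores)
    return baseline, drops

# Synthetic data (assumption): the label depends only on feature 0.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 3))
y = (X[:, 0] > 0).astype(int)

# Hypothetical "black box" standing in for a trained model.
def black_box(X):
    return (X[:, 0] > 0).astype(int)

baseline, drops = perturbation_importance(black_box, X, y)
print("baseline accuracy:", baseline)
print("accuracy drop per feature:", drops)
```

In this toy setting, only the feature the model actually relies on shows a large accuracy drop when perturbed; features the model ignores show none. Methods such as SHAP refine this idea by attributing each individual prediction to its input features rather than reporting a single global score.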