Developing interpretable machine learning methods for genome-wide association studies
Approved Research ID: 70075
Approval date: January 25th 2022
Complex diseases are caused by a mix of genetic and environmental factors. In the last decade, thanks to large-scale efforts like UK Biobank, researchers have identified many genetic factors of disease. Some of them are useful for disease prevention, others for treatment, and many have allowed scientists to understand disease better. However, most of these genetic factors are hard to interpret, which hinders our ability to make the most of these studies.
In this 3-year project, we will develop new methodologies to discover more genetic factors and boost our understanding of diseases. We will do this in two ways. First, we will develop methods that jointly analyse the patients' genetics and other sources of biological information. Our rationale is that, by studying the same disease from multiple perspectives, we will find common themes and mechanisms with a more straightforward interpretation. Second, we will explore new ways of approaching the data that make fewer assumptions about it and allow for so-called nonlinear relationships. While simplifying assumptions are helpful when they match the underlying biology, they hinder discovery when they do not. Hence, by dropping them when possible, we hope to uncover new links between genetics and disease.