Building and evaluating multivariate statistical machine learning methods for knowledge extraction from the UK Biobank
Principal Investigator:
Professor Thomas Nichols
Approved Research ID:
34077
Approval date:
October 24th 2018
Lay summary
Methods to allow joint analysis across the UK Biobank's many imaging, genomic, environmental and clinical variables remain challenging and underdeveloped. We will develop scalable multivariate statistical machine learning methods and software to extract useful features from all of the UKBB's different data modalities simultaneously, in order to a) predict different health outcomes from imaging and non-imaging data, b) associate brain features with non-brain factors while controlling for individual differences in environmental and genomic data, and c) use UKBB data as a replacement for Monte Carlo simulations in the evaluation and benchmarking of new and existing analysis methods. Our work will assist scientists in extracting features from the multitude of UK Biobank variables and in finding relationships among these features, ultimately supporting the UK Biobank's aim to improve the prevention and diagnosis of disease. We will develop methods and software that use the information shared among the different data modalities in the UKBB to extract features, ultimately building models that predict different health outcomes and associate brain-related features with non-imaging variables. We will also use the UK Biobank data to benchmark the performance of statistical methods that researchers use every day.