Novel Statistical Methods for Analyzing UK Biobank Data
The growing availability of biobanks linked to electronic health records (EHRs) creates new research opportunities but also poses new methodological challenges. This research aims to develop statistical methods that address several of these challenges: data quality, bias due to common causes of treatment and outcome (confounding), and the lack of transportability of models across data sources. For example, despite federal initiatives incentivizing EHR data harmonization across healthcare institutions, EHR systems notoriously do not talk to each other. This lack of interoperability can degrade a model's performance and introduce bias into biomedical research. We will adopt principles of how humans talk to each other to address the inherent heterogeneity in multi-institutional EHR data, and we will implement the proposed methods to generate and transfer knowledge between the Michigan Genomics Initiative and UK Biobank. We will also develop statistical methods that use secondary treatments and outcomes, previously omitted from association analysis and causal inference, to detect and reduce bias due to confounding variables.