Approved Research

Statistical methods development for dependent and non-sparse omics data

Florida State University

Lay summary

High-throughput technologies make it possible to test many features at once in complex, high-dimensional omics data. An acute problem is adjusting for multiple testing. Most of the existing literature assumes that useful features are sparse and that features are independent of one another. We aim to develop multiple testing procedures that account for the dependence and non-sparsity inherent in many high-dimensional omics datasets.

We will use UK Biobank data to evaluate the performance of the methods and software tools that we develop. Using these newly developed statistical methods, we hope to find novel genes that are associated with disease outcomes. Our proposed research will contribute to the growing literature on false discovery rate control for omics datasets, leading to principled and improved statistical methods for biological data. This project will last three years.