Author(s):
S Li, M Sesia, Y Romano, E Candès, C Sabatti
Publish date:
2 November 2021
Journal:
Biometrika
PubMed ID:
38633763

Abstract

This paper develops a method based on model-X knockoffs to find conditional associations that are consistent across environments, controlling the false discovery rate. The motivation for this problem is that large data sets may contain numerous associations that are statistically significant and yet misleading, as they are induced by confounders or sampling imperfections. However, associations replicated under different conditions may be more interesting. In fact, consistency sometimes provably leads to valid causal inferences even if conditional associations do not. While the proposed method is widely applicable, this paper highlights its relevance to genome-wide association studies, in which robustness across populations with diverse ancestries mitigates confounding due to unmeasured variants. The effectiveness of this approach is demonstrated by simulations and applications to the UK Biobank data.

Related projects

Our goal is to develop new data analysis methods that are well suited to discover the many genetic signals that influence traits of medical relevance.

Institution:
Stanford University, United States of America
