Research Questions:
What are the non-linear effects of genetic variants on phenotype risk that traditional linear models miss?
Can causal inference models effectively prioritize disease-associated variants for therapeutic targeting?
How can variant interaction graphs improve phenotype prediction accuracy?
Can we establish mechanistic links between genetic predictions and molecular function?
Objectives:
This project aims to develop advanced computational approaches for genome-wide association studies (GWAS) that capture complex, non-linear relationships between genetic variants and disease phenotypes. We will leverage recent advances in GPU computing to train large-scale models that surpass traditional linear regression approaches.
Scientific Rationale:
Traditional GWAS relies on linear regression models that can overlook genetic interactions and non-linear effects critical to disease etiology. By combining complementary computational approaches (machine learning, causal inference, and graph-based AI), we aim to identify novel disease mechanisms that remain hidden in genomic data.
Our methodology will integrate:
Deep learning models to capture non-linear variant-phenotype relationships
Causal inference frameworks to distinguish correlation from causation in variant prioritization
Graph neural networks to model variant interaction networks and their collective impact on phenotypes
AI foundation models to bridge the variant-to-function gap, connecting genetic findings to molecular mechanisms
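To make the motivation concrete, the following is a minimal sketch on simulated data, not the project's actual pipeline. It assumes two hypothetical, independent variants whose effect on a phenotype is purely epistatic (it depends only on their combination), and compares an additive linear fit with one that includes an interaction term. The explicit interaction term stands in for the richer non-linear models listed above; the sample size, allele frequency, effect size, and noise level are illustrative assumptions.

```python
import numpy as np


def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return in-sample R^2."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()


rng = np.random.default_rng(0)
n = 5000  # assumed number of simulated individuals

# Hypothetical genotypes: minor-allele counts (0, 1, 2) at two independent
# variants, each with an assumed allele frequency of 0.5.
g1 = rng.binomial(2, 0.5, size=n)
g2 = rng.binomial(2, 0.5, size=n)

# Assumed purely epistatic phenotype: the product of the centered genotypes
# plus Gaussian noise. Neither variant has a marginal (additive) effect.
y = (g1 - 1) * (g2 - 1) + rng.normal(0.0, 0.5, size=n)

# Additive model: the standard per-variant linear terms only.
additive = np.column_stack([g1, g2])
# Same model with an explicit pairwise interaction term added.
with_interaction = np.column_stack([g1, g2, g1 * g2])

r2_additive = r_squared(additive, y)
r2_interaction = r_squared(with_interaction, y)

print(f"additive R^2:         {r2_additive:.3f}")
print(f"with interaction R^2: {r2_interaction:.3f}")
```

In this toy setting the additive model explains almost none of the phenotypic variance, while the interaction model recovers a substantial share of it. Real data would involve many more variants, linkage disequilibrium, and population structure, none of which this sketch models.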
Expected Impact:
This research aims to uncover previously undetected genetic associations, reveal novel therapeutic targets, and accelerate drug repurposing by providing mechanistic insight into variant function. The resulting models and findings are intended to advance precision medicine and drug discovery pipelines.