Approved Research

Development of statistical methods to identify and characterize genetic variants linked to complex diseases under the time-to-event outcome modeling framework.

Principal Investigator: Dr Ryan Sun
Approved Research ID: 73569
Approval date: April 26th 2022

Lay summary

The overarching goal of our project is to develop a suite of novel, robust tools capable of investigating genetic causes of diseases in a modeling framework that explicitly accounts for the time of disease onset.

Genetic risk factors play a significant role in the development of many diseases, including most cancers. Numerous statistical methods have therefore been developed to help identify and explain how genetic variants contribute to disease processes. However, despite the vast effort expended in these studies, researchers are still generally unable to provide comprehensive and precise explanations of all the genetic effects influencing any given trait, except for a small group of simple diseases. Thus, novel tools to interrogate genetic data are still needed.

The vast majority of statistical tools developed for these investigations have been constructed for a framework that models continuous and binary outcomes. For example, there are numerous models available to associate genetic variants with height or with the presence or absence of cancer. Relatively few methods are available for explicitly modeling the onset time of a disease. However, it is intuitively clear that knowing when a disease occurred provides information above and beyond a binary indicator of whether it occurred. Such modeling is the focus of this proposal.
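To make the point about onset times concrete, the following is a minimal sketch using made-up data, not data or methods from this project. It compares two hypothetical groups of five people followed for ten years. The function names, the group data, and the choice of the standard Kaplan-Meier estimator are all assumptions made for illustration.

```python
def event_fraction(events):
    """Binary summary: fraction of people who developed the disease."""
    return sum(events) / len(events)


def km_survival(times, events, t):
    """Kaplan-Meier estimate of the probability of being disease-free at time t.

    times  -- follow-up time for each person (onset time, or end of follow-up)
    events -- 1 if the disease occurred at that time, 0 if follow-up simply ended
    """
    surv = 1.0
    # Distinct onset times up to and including t, in increasing order.
    onset_times = sorted({ti for ti, e in zip(times, events) if e and ti <= t})
    for u in onset_times:
        at_risk = sum(1 for ti in times if ti >= u)  # still under observation at u
        onsets = sum(1 for ti, e in zip(times, events) if e and ti == u)
        surv *= 1 - onsets / at_risk
    return surv


# Hypothetical group A: three early onsets (years 2, 3, 4), two people disease-free at year 10.
times_a, events_a = [2, 3, 4, 10, 10], [1, 1, 1, 0, 0]
# Hypothetical group B: three late onsets (years 7, 8, 9), two people disease-free at year 10.
times_b, events_b = [7, 8, 9, 10, 10], [1, 1, 1, 0, 0]

# The binary outcome cannot tell the groups apart.
frac_a = event_fraction(events_a)
frac_b = event_fraction(events_b)

# The time-to-event summary at year 5 can.
surv_a = km_survival(times_a, events_a, 5)
surv_b = km_survival(times_b, events_b, 5)

print(frac_a, frac_b)
print(round(surv_a, 3), round(surv_b, 3))
```

In this toy example both groups have the same fraction of affected people, yet the estimated probability of being disease-free at year five differs sharply between them. A binary analysis would treat the groups as identical, while a time-to-event analysis would not.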

The first class of models that we propose will associate genetic variants with sets of correlated disease outcome times. By analyzing sets of similar outcomes together instead of investigating them individually, we can integrate a much larger amount of information and potentially detect more variants associated with disease. The second class of models that we propose will better separate associated variants into a group of causal variants and a group of non-causal variants. Because association does not guarantee causation, this step is necessary to refine the list of associated variants that we compile. Identifying true causal variants improves the success rate of translational follow-up research such as therapeutic studies. Finally, the third class of models that we will investigate consists of risk prediction models, which will enhance our ability to perform early intervention across many diseases.

In conclusion, our aims will broadly leverage the understudied time-to-event framework to help us better understand the genetic etiology of complex diseases. We expect this work to take approximately three years.