Last updated:
ID: 256056
Start date: 11 November 2024
Project status: Current
Principal investigator: Professor Ming Zhang
Lead institution: Tongji University, China

Understanding the human genome is pivotal for personalized medicine and for improving disease treatment. However, the functional roles of noncoding variants and structural variants in our DNA remain largely unclear. In addition, the link between spatially specific genes and diseases is largely unknown. Understanding these functional variants and region-specific genes could hold the key to smarter diagnosis and treatment of diseases.

Deep learning is a type of artificial intelligence that mimics the way humans learn. Some deep learning models have been designed to predict the functional effects of genetic variants, such as their impact on pre-mRNA splicing and gene expression. Despite their potential, current models often rely on a narrow selection of genetic data for training, which fails to capture the rich diversity of human genetics.

Our research proposal leverages the UK Biobank’s genetic and health data, including whole-genome sequencing (WGS) data from around 500,000 individuals. This unique resource offers us an opportunity to develop novel deep learning models that can better predict the functional roles of genetic variants, especially noncoding and structural variants. Moreover, the resource would allow us to identify links between region-specific genes and disease risk.

Over three years, our research aims to develop several deep learning models that predict how genetic variants affect RNA splicing, alternative polyadenylation sites in RNA, and DNA methylation status. In addition, our proposal would pinpoint the roles of region-specific genes in the etiology of human diseases.

By predicting these functional effects, we hope to unravel how noncoding and structural variants affect key biological processes and to establish their links to complex human diseases (such as neurodegenerative diseases, cardiovascular diseases, and cancer). Our methods involve training these models on the UK Biobank’s genetic sequences so that they can handle the diversity and complexity of human genetics more effectively.

By navigating the intricate roles of our genetic code, we could unlock precise treatments and diagnostics. In addition, we aim to provide novel insights into the genetic mechanisms by which noncoding variants, structural variants, and region-specific genes contribute to human diseases. By understanding the links between functional or regional genetic elements and biomarkers, we could significantly enhance our ability to predict, prevent, and treat various diseases, and tailor our approaches toward personalized medicine.