
Approved research

Precise Polygenic Risk Score

Principal Investigator: Dr Nadav Rappoport
Approved Research ID: 56774
Approval date: April 1st 2020

Lay summary

A Mendelian trait, such as cystic fibrosis, is controlled by a single locus: a mutation in a single gene accounts for the whole effect (the disease). In contrast, complex traits are diseases or phenotypes whose heritable component is not determined by a change at a single genomic locus (nucleotide or gene) but by variations or mutations in multiple, and sometimes many, genomic regions. Detecting the genomic regions (loci) that affect such a trait is therefore challenging. State-of-the-art methods for estimating the effect of a genome on a trait rely on the simplifying assumption that genomic loci affect the trait independently of each other, so that summing the effect sizes of these loci gives a good estimate. Such methods have been shown to fall short of capturing the whole genomic effect on the trait, a gap known as the 'missing heritability problem'. We aim to develop a more precise model that explains a trait from genomic data by using agnostic information about genomic regions. For example, we can use prior knowledge about the genes in proximity to each genomic locus, such as which genes are physiologically related to the trait of interest. The genome in each cell is condensed by histone proteins into a structure known as chromatin. Certain known chemical modifications of histone proteins affect the density of a region. We will use data on histone chemical modifications in genomic regions as an indicator of open chromatin regions. We will develop a model to predict a trait in a pathway-specific and tissue-specific manner. Pathway-specific prediction is based on variations in genomic loci in proximity to genes participating in a biological pathway. Tissue-specific prediction is based on chromatin markers that indicate open chromatin regions in a tissue. Another association with tissue will be mediated by tissue-specific gene expression.
Here, variations in genomic loci are associated with the tissues in which a proximal gene is differentially expressed. Our multiple predictors will be combined to estimate the trait of interest and to cluster subjects into subgroups. The subgroups will be explored to identify sub-phenotypes, that is, diseases with the same presentation but with subgroup-specific genomic effects. Our proposed method will be tested on a variety of quantitative and binary phenotypes, traits and diseases. These include quantitative traits such as BMI or cholesterol level, as well as binary ones such as the diagnosis of specific diseases like hypertension or asthma.
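
As an illustration only, the additive assumption behind standard polygenic scores, and a pathway-specific variant of it, can be sketched in a few lines of Python. This is a minimal sketch and not the project's actual model: the variant names, effect sizes, allele counts and pathway groupings are all hypothetical, and the pathway score here is simply the additive sum restricted to the variants assumed to lie near that pathway's genes.

```python
from typing import Dict, Set


def additive_score(dosages: Dict[str, int], effects: Dict[str, float]) -> float:
    """Additive polygenic score: sum of (effect size x allele count) over variants.

    Variants missing from `dosages` are treated as carrying zero effect alleles.
    """
    return sum(effects[v] * dosages.get(v, 0) for v in effects)


def pathway_scores(
    dosages: Dict[str, int],
    effects: Dict[str, float],
    pathways: Dict[str, Set[str]],
) -> Dict[str, float]:
    """One partial score per pathway, using only the variants assigned to it."""
    return {
        name: additive_score(dosages, {v: effects[v] for v in variants if v in effects})
        for name, variants in pathways.items()
    }


# Hypothetical per-variant effect sizes (e.g. from an association study).
effects = {"var_a": 0.5, "var_b": -0.25, "var_c": 0.125}

# Hypothetical allele counts (0, 1 or 2) for one subject.
dosages = {"var_a": 2, "var_b": 1, "var_c": 0}

# Hypothetical assignment of variants to pathways via nearby genes.
pathways = {"lipid": {"var_a", "var_c"}, "immune": {"var_b"}}

total = additive_score(dosages, effects)
per_pathway = pathway_scores(dosages, effects, pathways)

print(total)        # whole-genome additive score
print(per_pathway)  # one score per pathway
```

Splitting the single sum into per-pathway (or, analogously, per-tissue) scores gives each subject a vector of predictors rather than one number, which is what makes the later steps of combining predictors and clustering subjects possible.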