Identify structural, psychological/psychometric, and physiological alterations associated with combinations of genetic risk variants for schizophrenia
The primary aim of this study is to identify structural, psychological/psychometric, and physiological alterations associated with combinations of genetic risk variants for schizophrenia (SCZ). The secondary aim is to pinpoint environmental factors that modulate the effects of these genetic variants. We propose to leverage the genetic and phenotypic data available in the UK Biobank to characterize the phenotypic, potentially predisposing, consequences of subsets of SCZ-associated genetic variants, independent of SCZ status. This analysis will provide insights into the mechanisms and pathophysiological components underlying SCZ emergence and progression.
To achieve this aim, we require the additional ICD diagnosis fields, both to identify clinically reported psychiatric illnesses (rather than self-reports alone) and to obtain information on comorbidities with other diseases. We therefore request the ICD diagnosis fields.
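As an illustration of how ICD diagnoses could be used to flag clinically reported psychiatric illness and comorbidities, the following is a minimal sketch, not the study's actual pipeline. The record layout, the participant IDs, and the function names are hypothetical, and writing codes without the dot (e.g. `F200`) is an assumption about the encoding. The prefixes themselves follow the ICD-10 classification: F20 is schizophrenia, chapter F covers mental and behavioural disorders, and I20 to I25 cover ischaemic heart diseases.

```python
# Hypothetical records: participant ID -> list of ICD-10 codes (assumed dot-free).
# This layout is for illustration only and is not the UK Biobank schema.
records = {
    "p1": ["F200", "I251"],
    "p2": ["I210"],
    "p3": ["F321"],
    "p4": [],
}

SCZ_PREFIXES = ("F20",)    # ICD-10 F20: schizophrenia
PSYCH_PREFIXES = ("F",)    # ICD-10 chapter F: mental and behavioural disorders
CAD_PREFIXES = ("I20", "I21", "I22", "I23", "I24", "I25")  # ischaemic heart diseases


def has_code(codes, prefixes):
    """Return True if any code starts with one of the given ICD-10 prefixes."""
    return any(code.startswith(prefixes) for code in codes)


def summarize(records):
    """Flag SCZ, any psychiatric diagnosis, and CAD for each participant."""
    return {
        pid: {
            "scz": has_code(codes, SCZ_PREFIXES),
            "psychiatric": has_code(codes, PSYCH_PREFIXES),
            "cad": has_code(codes, CAD_PREFIXES),
        }
        for pid, codes in records.items()
    }


summary = summarize(records)
# Participants carrying both an SCZ and a CAD diagnosis (comorbidity).
comorbid = [pid for pid, flags in summary.items() if flags["scz"] and flags["cad"]]
```

In this toy data, only `p1` carries both an SCZ and a CAD code, so `comorbid` contains that single participant.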
In addition, we plan to validate our analysis approach on a second disease class. To that end, the identical analytical strategy must be applied to another, more clearly defined phenotype. We therefore propose to apply exactly the same methodology and analysis to coronary artery disease (CAD), a clearly defined phenotype with a well-described, strong genetic contribution. This makes CAD well suited for benchmarking our new analysis method and for providing additional support for the validity of the results obtained in the schizophrenia analysis.
The final aim of the study is to operationalize the identified insights into the genetic basis of SCZ and CAD to predict each person's individual-level risk for these diseases, as well as for other diseases and phenotypes with a strong genetic component that are associated with them.
For that purpose, we will build upon the combinations of genetic variants identified in the first stage of our analysis and use machine learning to predict disease/phenotype status from genotype information. In addition, we will stratify patients into distinct groups according to the underlying biological mechanisms predicted by our method. To that end, we will use gene expression levels imputed from the genotype. We will showcase the power of our approach on SCZ and highlight its generalizability by applying it to several other diseases and endophenotypes that have a strong genetic component. The full set comprises SCZ, CAD, and a subset of the ICD-10 diagnoses and phenotypes associated with them in our first set of analyses.
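The proposal does not specify which machine learning method will be used. As a rough illustration of predicting disease status from genotype information, the sketch below fits a plain logistic regression to allele dosages (0, 1, or 2 copies of the risk allele) using gradient descent. Logistic regression is a stand-in chosen for simplicity, not the study's method. The toy genotypes, labels, variant count, and function names are all invented for the example, and the sketch models variants additively rather than in combinations.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        # Average the gradient over individuals before taking a step.
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(w, b, x):
    """Predicted probability of case status for one genotype vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Toy data (invented): rows are individuals, columns are allele dosages
# at three hypothetical variants; y is case (1) or control (0) status.
X = [
    [0, 0, 1], [0, 1, 0], [1, 0, 2], [0, 0, 0],
    [2, 1, 1], [2, 2, 0], [1, 2, 1], [2, 1, 2],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(X, y)
low_risk = predict_risk(w, b, [0, 0, 0])    # no risk alleles
high_risk = predict_risk(w, b, [2, 2, 2])   # two risk alleles at every variant
```

In this toy setting, an individual carrying no risk alleles receives a lower predicted risk than one carrying two at every variant. A real analysis would additionally need held-out evaluation, adjustment for covariates such as ancestry, and far more variants and individuals.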