Principal Investigator: Dr Manuel Rivas
Department: Biomedical Data Science
Stanford University, Biomedical Data Science, 365 Lasuen Street, Third Floor Littlefield, Stanford CA 94305, United States
Tags: 24983, genetics, High-dimensional methods, Learning, Therapeutics
Lead Collaborators: 1) Dr Matti Pirinen
Collaborating Institutions and Addresses:
- University of Helsinki, Institute for Molecular Medicine Finland, Human Genetics, Biomedicum Helsinki 2U, Tukholmankatu 8, Helsinki 00014, Finland
Internally funded by Stanford University’s start-up fund
1a: This proposal seeks access to UK Biobank data to support efforts to generate effective therapeutic hypotheses from genomic and hospital in-patient data. We have developed novel statistical methods to assess the impact of genetic variation across a broad range of disease outcomes. We plan to take advantage of the tree structure of the ICD-10 codes to improve inference. By doing so we hope to prioritize genetic effects that are consistent with a protective profile. This will result in a set of therapeutic hypotheses that academics, pharmaceutical companies, and the public may be able to pursue.
1b: The research we plan is in agreement with the stated aim of UK Biobank: “research intended to improve the prevention, diagnosis and treatment of illness and the promotion of health throughout society”.
By communicating to the public the therapeutic hypotheses we generate from UK Biobank data, we hope to accelerate interest in drug development based on these insights.
1c: We will combine assessments of genetic associations with the tree-structure of ICD-10 codes and apply new statistical learning techniques to the summary data.
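As a rough illustration of what using the tree structure of ICD-10 codes can mean in practice, the sketch below rolls diagnoses up from subcategories to their parent categories, so that an association can be assessed at either level. It is not the method proposed here: the function names, the prefix-based notion of a parent, and the toy records are all assumptions made for this example.

```python
from collections import defaultdict


def icd10_ancestors(code):
    """Return the code and its prefix-based ancestors, most specific first.

    Assumption: after removing any dot, the first three characters are the
    ICD-10 category and any further characters denote a subcategory.
    """
    code = code.replace(".", "").upper()
    return [code[:n] for n in range(len(code), 2, -1)]


def rollup_cases(diagnoses):
    """Map every node (a code or one of its ancestors) to its set of cases."""
    cases = defaultdict(set)
    for participant, codes in diagnoses.items():
        for code in codes:
            for node in icd10_ancestors(code):
                cases[node].add(participant)
    return dict(cases)


# Toy, made-up records for illustration only (hypothetical participant IDs).
toy_diagnoses = {"p1": ["I210"], "p2": ["I219"], "p3": ["E11.9"]}
cases = rollup_cases(toy_diagnoses)
```

Here `cases["I21"]` contains both `p1` and `p2`, even though they carry different subcategory codes. Pooling cases at parent nodes in this way is one simple route by which a hierarchy can increase the number of cases available per test.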
We will focus on a special class of genetic variants: protein-truncating variants (PTVs), commonly referred to as loss-of-function variants. Scanning for protective PTVs has been a successful strategy. These protective genetic variants reveal a process that is safe (it occurs naturally in healthy adults) and effective (it is proven to reduce risk of disease).
1d: The full cohort.
We will use deep learning techniques to derive new features from the bulk field and assess how they are related to genetic variants that we prioritize as putative protective alleles or genetic variants that modify disease risk.
This proposal seeks access to UK Biobank data to support efforts to generate effective therapeutic hypotheses from human genomic and phenotype data across UK Biobank. We have developed novel statistical methods to assess the impact of genetic variation across a broad range of disease outcomes and non-disease phenotypes. We have developed methods to compute genetic parameters across phenotypes, including heritability, genetic correlation, and polygenic risk scores. We plan to take advantage of several features of the UK Biobank to improve inference, including access to ICD-10 codes, verbal questionnaire data, online questionnaires, biomarkers, imaging features, blood-based measures, eye measurements, and anthropometric traits. More specifically, we plan to analyze data fields from Physical measures (Category 100006), Verbal Interview (Category 100071), Touchscreen including family history and self-reported disease (Category 100025), Cognitive function (Category 100026), Imaging derived features (Category 100003), OCT and fundus eye images (Category 100016, Category 100013), Health Related outcomes (Category 100091), and Digestive Health and Mental Health data from the Online follow-up (Category 100089), together with the Genomics data (Category 100314).
By doing so we hope to prioritize modifiers of disease and get a better sense of the predictive utility of genomic risk scores. This will result in a set of therapeutic hypotheses that academics, pharmaceutical companies, and the public may be able to pursue. Finally, we plan to host all of our summary results as part of the Global Biobank Engine (https://biobankengine.stanford.edu).
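For readers unfamiliar with the polygenic risk scores mentioned above, a score is commonly computed as a weighted sum of a person's effect-allele counts, with the weights taken from association effect sizes. The sketch below shows only that generic definition. It is not the scoring method developed for this project, and the variant names and weights are invented for illustration.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2 copies per variant).

    Variants without a weight are ignored. Both arguments are dictionaries
    keyed by a variant identifier (hypothetical names in this example).
    """
    return sum(weights[v] * d for v, d in dosages.items() if v in weights)


# Made-up effect sizes and genotypes, for illustration only.
toy_weights = {"variant_a": 0.5, "variant_b": -0.25}
toy_dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 1}

score = polygenic_risk_score(toy_dosages, toy_weights)
```

A negative weight, as for `variant_b` here, corresponds to an allele associated with lower risk, which is the kind of protective effect this proposal aims to prioritize.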
Last updated Apr 24, 2019