While genetic studies have identified thousands of genetic variants that affect almost all human traits, for the most part the causal path from DNA to disease is still unknown. Solving this problem, known as the “variant to function” problem, would immediately identify thousands of genes as causal for disease, with huge implications for drug development and patient treatment. In this proposal we will combine UK Biobank data with information from reference datasets, both public and those produced by consortia of which we are a part, to identify cellular processes which lie between DNA and disease. These processes include gene expression, translation into proteins and production of metabolites, as well as those related to genome function. We will look at environmental modifiers of genetic effects on cellular processes, such as age, sex, body mass index, anthropometric traits, diet, ethnicity, spirometry, comorbidities and prescription medication, and whether these translate into environmental modifiers of genetic effects on UK Biobank traits. In the first stage we will develop statistical models of molecular phenotypes, such as gene expression and protein levels, using datasets produced by the GTEx and IMI DIRECT consortia. We will use these models to predict these phenotypes in UK Biobank participants, and test the predicted phenotypes for association with UK Biobank traits, to identify genes, tissues and proteins that have a causal influence on these traits. We will also apply methods such as transcriptome-wide association studies, colocalisation and Mendelian randomisation to compare genetic effects on cellular processes with genetic effects on disease, and to implicate genes and proteins in the development of disease. Finally, we will look at associations between molecular phenotypes measured directly in UK Biobank, such as proteomics and metabolites, and UK Biobank traits, to infer when these associations are consequences of disease and when they are causal.
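
The predict-then-test idea described above can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the proposal does not specify a model type, so ridge regression is swapped in as one plausible choice, the function names (`fit_ridge`, `association_z`, `simulate`) are hypothetical, and the genotypes, weights and effect sizes are randomly generated rather than drawn from GTEx, IMI DIRECT or UK Biobank.

```python
import numpy as np


def fit_ridge(G, y, lam=1.0):
    """Ridge weights mapping genotypes G (n x p) to a molecular phenotype y (n,).

    Ridge regression is an illustrative assumption, not the proposal's stated model.
    """
    Gc = G - G.mean(axis=0)
    yc = y - y.mean()
    p = G.shape[1]
    return np.linalg.solve(Gc.T @ Gc + lam * np.eye(p), Gc.T @ yc)


def association_z(x, t):
    """Slope divided by its standard error from a simple linear regression of t on x."""
    xc = x - x.mean()
    tc = t - t.mean()
    beta = (xc @ tc) / (xc @ xc)
    resid = tc - beta * xc
    se = np.sqrt((resid @ resid) / (len(x) - 2) / (xc @ xc))
    return beta / se


def simulate(seed=0, n_ref=500, n_target=5000, n_snps=20):
    """Simulate a small reference panel and a larger target cohort (all values hypothetical)."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.5, n_snps)   # allele frequencies
    true_w = rng.normal(0, 0.3, n_snps)     # true SNP effects on expression

    # Reference panel: genotypes and measured expression are both available.
    G_ref = rng.binomial(2, freqs, size=(n_ref, n_snps)).astype(float)
    expr_ref = G_ref @ true_w + rng.normal(0, 1, n_ref)

    # Target cohort: genotypes and trait only; expression is never observed.
    G_tgt = rng.binomial(2, freqs, size=(n_target, n_snps)).astype(float)
    expr_tgt = G_tgt @ true_w + rng.normal(0, 1, n_target)
    trait = 0.3 * expr_tgt + rng.normal(0, 1, n_target)
    return G_ref, expr_ref, G_tgt, trait


G_ref, expr_ref, G_tgt, trait = simulate()

# Stage 1: learn genotype-to-expression weights in the reference panel.
w = fit_ridge(G_ref, expr_ref)

# Stage 2: predict expression in the target cohort and test it against the trait.
pred = (G_tgt - G_tgt.mean(axis=0)) @ w
z = association_z(pred, trait)
print(f"association z-statistic for predicted expression: {z:.2f}")
```

In this toy setting the trait is simulated to depend on expression, so a strong association is expected. In real data an association between predicted expression and a trait is only suggestive, which is why methods such as colocalisation and Mendelian randomisation are needed to support a causal interpretation.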