Approved Research

Deep Learning models for predicting phenotypic quantitative traits from high density genotyping data

French National Research Institute for Agriculture, Food and Environment (INRAE)

Lay summary

Aims

Genotyping is becoming routine in predictive medicine (oncology), genealogy research, and genome-wide association studies (GWAS) of diseases or psychological traits. It is also becoming routine in another field, animal genetics, which is our research topic for production and health. The aim of our project is to develop Deep Learning (DL) or Machine Learning (ML) models that predict quantitative phenotypes from the genotyping data of individuals.

Scientific rationale

The large volume of genotyping data now exceeds the computational capacity of classical phenotype prediction methods. To improve predictive models of quantitative phenotypes, we propose to apply DL/ML methods, which are better suited to big data and non-linear responses.

Project

These DL models need a large amount of training data before they can be used for prediction. We will implement DL/ML models i) to generate artificial genotypes and phenotypes using a generative adversarial network (GAN) model, and then ii) to combine real and artificial data to train DL/ML models that predict quantitative phenotypes from the genotyping data of individuals.

Impact

The generative model could be useful for producing artificial genotype-phenotype data that closely resemble real data but are truly anonymous, and that can be used for other statistical analyses, GWAS or models. Optimising these DL/ML models for genomics could be a decisive advance for many applications in human medicine and animal genetics.