Effective disease age of onset assessment for individuals that are carriers of variable penetrance genes, using machine learning methods.
Principal Investigator:
Dr Michal Golan-Mashiach
Approved Research ID:
50913
Approval date:
January 14th 2020
Lay summary
Many individuals with a particular disease-causing mutation or genotype do not express the disease phenotype, a phenomenon known as 'incomplete penetrance'. For many genetic diseases, genotype alone cannot predict either whether the disease will occur or its age of onset. For example, a germline mutation in BRCA1/2 (BRCA1/2+) is the most penetrant genetic predisposition for breast cancer. Females with BRCA1/2+ pathogenic variants have up to an 87% risk of developing an associated cancer, while males have up to a 20% risk. In general, these mutations are known to have incomplete penetrance, with some carriers developing the disease in their thirties and a small portion of carriers surviving to their ninth decade without developing symptoms. Therefore, in many cases, and especially in hereditary cancer, simple genetic screening is not enough. Using machine learning and deep learning methods, we will combine genotype and phenotype data with numerous external variables in order to produce accurate estimates of both outcomes. By integrating genetic data with a variety of environmental factors such as lifestyle, medical conditions, exposures, and physical measures, our goal is to generate a health calculator that can effectively predict an individual's disease risk and age of onset, with an emphasis on hereditary cancers and other incomplete penetrance diseases. We believe that, with a sufficiently comprehensive database, such a tool can be constructed within 24 months. Individuals receiving such information will benefit from higher quality genetic counseling, prophylactic treatment, and early diagnosis, and will be able to adjust their lifestyles more effectively to reduce their risk of developing the disease.
In addition, such information can greatly contribute to clinical research and provide insight into the complex biological interactions that lead to disease occurrence and severity.