Last updated Apr 2, 2020
Collaboration between UK Biobank, Regeneron and GSK delivers significant new data and genetic insights
A vast tranche of new UK Biobank genetic data becomes available to health researchers today, offering an unprecedented resource to enhance understanding of human biology and aid in therapeutic discovery.
The exome sequence data of 50,000 UK Biobank participants were generated at the Regeneron Genetics Center through a collaboration between UK Biobank, Regeneron (US) and GSK (UK) and are linked to detailed health records, imaging and other health-related data. Regeneron is also leading a consortium of biopharma companies (including AbbVie, Alnylam, AstraZeneca, Bristol-Myers Squibb, Biogen, Pfizer and Takeda) to complete exome sequencing of the remaining 450,000 UK Biobank participants by 2020. In addition, GSK has committed a £40 million investment to initiatives, such as UK Biobank, that harness advances in genetic research in the development of new medicines.
Consistent with the founding principles of UK Biobank, the first tranche of data has now been incorporated back into the UK Biobank resource for the global health research community to use. It follows a brief exclusive research period for Regeneron and GSK. Additional tranches of data will similarly be released over the next two years. All sequencing and analysis activities are undertaken on a de-identified basis, with the utmost consideration and respect for participant privacy and confidentiality principles.
This major enhancement to UK Biobank would have been unimaginable when the study began recruiting participants in 2006, and makes it one of the most important studies of population health in the world. It represents huge leverage of the public and charity investment that has supported UK Biobank up to this point; the costs of such a project would have been prohibitive had UK Biobank had to raise the funding itself.
“We believe this is the largest open access resource of exome sequence data linked to robust health records in the world – and this is just the beginning. There is so much actionable information in this resource that can be utilized by scientific minds around the globe. We are hard at work mining the data for novel findings that will accelerate science, innovative new medicines and improved patient care, and are excited to have others join us in this important quest.” Aris Baras, MD, Senior Vice President and Head of the Regeneron Genetics Center.
The exome makes up 1-2 percent of a human genome and contains the protein-coding genes. It is this area that scientists believe has most relevance for discovering genetic variants that may inform the discovery and development of new and improved medicines. The exome sequencing work supports other UK Biobank genetics analyses under way, including whole genome sequencing of 50,000 participants funded by UK Research and Innovation as part of the Industrial Strategy Challenge Fund.
“We strongly support the UK’s life sciences strategy, and this is a great example of what can be achieved by all parts of the sector working together to make sure the UK remains at the cutting-edge of research. Genetics is playing an increasingly important role in research, and by generating and now integrating these exome data, the UK Biobank has some of the richest health and genetics data available for use by the broader scientific community to enhance their understanding and research effort. We expect this will ultimately lead to more scientific breakthroughs that can improve health.” Tony Wood, SVP, Medicinal Science and Technology, GSK.
A preprint of a manuscript by researchers at Regeneron and GSK describing their findings from examination of the first 50,000 exomes is available on bioRxiv (biorxiv.org). Key findings included novel loss-of-function associations with large effects on disease risk, including associations of PIEZO1 with varicose veins, MEPE with bone mineral density and osteoporosis, COL6A1 with ocular traits, and IQGAP2 and GMPR with blood cell traits.
GSK and Regeneron have significant expertise in genomics. The sequencing was performed by the Regeneron Genetics Center (RGC) in New York state, one of the world’s largest human genetics sequencing and research programs. The Regeneron Genetics Center is currently sequencing at a rate of 500,000 exomes per year, and Regeneron has advanced multiple new targets and development programs based on its genetics discoveries. GSK is increasingly incorporating the almost daily advances in genetics and genomics into its drug research programmes, forming collaborations and working closely with other world-leading organisations such as 23andMe, Open Targets and Altius.
Professor Fiona Watt, Executive Chair of the UK Medical Research Council, which has funded UK Biobank since its inception and continues to support enhancement activities, said it was very pleasing to see industry and academia tackling health research together.
“UK Biobank was established to do science in new ways. Industry has led the way on this exome sequencing project and the fruits of that work mean UK Biobank can now deliver important genetic data that would otherwise not be available to researchers.”
Wellcome, which also funds UK Biobank, thanked the many people who made this work possible. Sara Marshall, Head of Clinical Research & Physiological Sciences at Wellcome, said:
“Today’s announcement proves the immense value of the UK Biobank and we look forward to seeing many new collaborations between UK Biobank, industry and academia on the back of this new data being released. The success of UK Biobank is thanks to the 500,000 people who have generously agreed to have their lives studied for years.”
Professor Sir Rory Collins, UK Biobank’s Principal Investigator, encouraged approved researchers to use the data. “We are excited about the possibilities of letting loose the imaginations of scientists from around the world on these large-scale genomic data linked to so much detailed information related to health in the 500,000 UK Biobank participants,” he said.
UK Biobank has also updated a range of other health information on its 500,000 participants. This includes updates of hospital, cancer and death data, and blood biomarkers. New disease-related algorithms are provided, including for chronic obstructive pulmonary disease, kidney disease, dementia and Parkinson’s disease, and the stroke and heart attack algorithms have been updated.