
UK Biobank launches innovative cloud-based research analysis platform to vastly increase scale and accessibility of the world’s most comprehensive biomedical database

Innovative cloud-based Research Analysis Platform launched to increase scale and accessibility of resource

28 September 2021 – UK Biobank, a large-scale biomedical database and research resource, announces today the launch of a uniquely powerful and innovative research platform that allows approved researchers to access and analyse the entire UK Biobank database securely, in the cloud, from anywhere in the world. It is designed to accommodate the vast and increasing scale of genomics and healthcare data pertaining to UK Biobank’s 500,000 participants.

This new Research Analysis Platform (RAP), enabled by DNAnexus and powered by Amazon Web Services (AWS), exponentially increases the scale and accessibility of the world’s largest and most comprehensive biomedical database for researchers around the world[1] to advance understanding of human disease.

"UK Biobank’s unique, biomedical database – the most detailed and accessible research resource of its kind – is already a vital resource for the global medical research community. We are hugely excited by the potential of RAP to further increase the speed and scale of scientific discoveries and to democratise access to enable the brightest minds across the world to help improve human health."

Mark Effingham, Deputy CEO of UK Biobank

A revolutionary step forward
Whilst the growth of the UK Biobank resource has created an unparalleled opportunity for medical research, collating, storing, and accessing data at this scale has had its limitations. The new cloud-based platform removes these barriers and will accelerate both the speed and scale of health-related research.

Democratising access
Up until now, approved researchers have had to download de-identified participant data to conduct their research analyses, requiring significant local storage space, computing power, and technical resources. The new RAP replaces these time-intensive data downloads by allowing researchers to bring their analyses to the data and providing them with secure and rapid access to flexible and scalable cloud computing. This makes vital research more accessible and cost-effective for a broader range of researchers around the world.

To democratise access to the resource even further, AWS has kindly pledged $1.5 million in research credits to support access for more researchers from low- and middle-income countries and early career researchers.

An increasingly scalable database
UK Biobank currently contains 11 petabytes of data and is expected to grow to over 40 petabytes by 2025. To put the vast scale of these data into perspective, it would take over a century of continuous viewing to consume 40 petabytes' worth of high-definition 4K movies.
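As a rough check on that comparison, the arithmetic can be sketched in a few lines of Python. The data rate for 4K video is not given in this release; the figure of 7 gigabytes per hour used below is an illustrative assumption only, as are the function and constant names. A higher assumed rate shortens the viewing time and a lower one lengthens it.

```python
# Back-of-envelope estimate: how long would it take to watch 40 PB of 4K video?
# ASSUMPTION: 4K video consumes about 7 GB per hour. This rate is illustrative
# and is not taken from the UK Biobank announcement.

PETABYTE = 10**15          # bytes, decimal (SI) definition
GIGABYTE = 10**9           # bytes, decimal (SI) definition
HOURS_PER_YEAR = 24 * 365  # ignoring leap years


def years_of_viewing(total_petabytes: float, gigabytes_per_hour: float) -> float:
    """Return the years of continuous viewing needed to play the given volume."""
    total_bytes = total_petabytes * PETABYTE
    hours = total_bytes / (gigabytes_per_hour * GIGABYTE)
    return hours / HOURS_PER_YEAR


ASSUMED_4K_GB_PER_HOUR = 7.0  # hypothetical rate, see note above

if __name__ == "__main__":
    years = years_of_viewing(40, ASSUMED_4K_GB_PER_HOUR)
    print(f"About {years:.0f} years of continuous 4K viewing")
```

Under this assumption the result is several centuries, and even at a much higher assumed rate of 25 gigabytes per hour it stays above one hundred years, which is consistent with the "over a century" comparison.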

The new platform is uniquely designed to accommodate the increasing scale of the UK Biobank resource, and will allow researchers to access all of the data included in the resource, including the whole exome[2] and whole genome sequencing data[3], which are stored securely to the highest industry and professional standards. This will enable more extensive research into the precise genetic determinants of disease, including the impact of rare genetic variants.

A more collaborative resource
Improving the world's health is complex, which is why UK Biobank fosters collaboration between experts from academia, industry, charity, and government to support advances in health research. The new platform resolves the complexity associated with integrating the genomics and clinical data within the database. It also enables greater collaboration between researchers around the world by allowing users to analyse multiple data types together and to work on the same research project within the cloud-based platform.

"DNAnexus is proud to partner with UK Biobank on this landmark initiative that combines our leading biomedical informatics platform and insight tools with UK Biobank’s genomics and clinical datasets. This global collaboration brings us one step closer to accomplishing our vision of democratising data access to drive innovations in research that profoundly impact patient lives."

Richard Daly, Chief Executive Officer at DNAnexus

Thanks to funding provided by Wellcome, in collaboration with the Medical Research Council (MRC), the new platform will increase the speed and scale of research into the diagnosis, treatment and prevention of the most devastating diseases, benefiting millions of people in the UK and around the world.

"The UK Biobank Research Analysis Platform represents a unique cloud-based location for researchers from across the globe to access, analyse and interrogate UK Biobank’s data. It removes the complex computing barrier that some researchers from low- and middle-income countries face in downloading and analysing UK Biobank’s data, generating a truly accessible platform for all researchers. Wellcome is excited to support the launch of this platform, to help create a more equitable research environment and better health outcomes for everyone."

Michael Dunn, Director of Discovery Research at Wellcome

"This new research analysis platform will enable greater global access to UK Biobank and facilitate even greater understanding of why some people develop particular diseases and others do not. MRC is proud to support UK Biobank, an organisation that is constantly pushing the boundaries of science and technology, and we are excited to see how this powerful new research tool will enable more research and ultimately lead to improvements in human health."

Fiona Watt, MRC Executive Chair

Notes to Editors
• [1] DNAnexus has developed the new cloud-based RAP, which is held securely in the UK by AWS on behalf of UK Biobank, to enable more researchers to access increasing amounts of large, diverse biomedical data. Key platform benefits include:

o Enable multi-omics and clinical data collection, curation, and analysis
o Leverage a library of standard tools to deploy novel analytic methods for population-scale genomic discovery
o Build and scale the most secure and compliant cloud-based data management infrastructure specifically designed for the healthcare and life sciences industry
o Enable democratised data access to drive collaborative research for scientific and medical discovery
o Provide easy access to new data as the UK Biobank data set grows in diversity and scale

• [2] Whole-exome sequencing (WES) measures the regions of the genome (about 2%) that are involved in coding for proteins and is particularly suitable for identifying disease-causing and/or rare genetic variants. Exome sequencing for 50,000 participants has been performed by Regeneron and GlaxoSmithKline. A further consortium is undertaking exome sequencing on the remaining 450,000 participants. WES data for 300,000 participants are now available to researchers with a further 150,000 expected in Autumn 2021.

• [3] Whole-genome sequencing (WGS) measures the entire genome and will provide information that will complement and enhance the existing genotyping and exome data. It is the biggest endeavour of its kind ever undertaken and will transform the way in which scientists study the genetic determinants of a wide range of health outcomes. WGS data for 200,000 participants will be available on the RAP in Q4 2021, with the remainder expected to be made accessible to researchers in early 2023.