On this page you can read about all the ways we are working to meet the data security challenges of today and into the future.
The changing data landscape
Protecting the data
Find out how we protect our participants’ confidentiality while making data available for health research.
External audits
We undergo regular external audits.
We made a commitment to our participants to make their data available to researchers around the world while protecting their privacy and confidentiality.
Researchers have only ever had access to ‘de-identified’ UK Biobank data. This means that the data do not contain personally identifying information – such as participants’ names, addresses, exact dates of birth or NHS numbers. (Data that have been de-identified in this way are referred to as ‘pseudonymised’ data by the Information Commissioner’s Office.)
But the data landscape has changed dramatically since we first made our data available in 2012. At that time, allowing researchers to download data was the only feasible way of sharing large-scale data for research. We based our security around de-identifying the data, our vetting process, and our contract with researchers and their institutions, within the ‘Five safes’ framework.
Since 2012, we have made significant changes to our processes and how we store and share data for health research.
The UK Biobank research platform
In 2018, with initial funding from the Medical Research Council, we began work to sequence the whole genome of every participant. This would create petabytes of data, far more than most researchers could download and store – storage costs alone would have been in the millions of pounds each year.
So, in 2020, we developed the world’s first cloud-based research platform able to handle the scale of data and numbers of researchers using UK Biobank. This launched in 2021 and meant that we could now start bringing researchers to the data, rather than the other way round.
In its infancy, the platform was suited to researchers working on genomic data. By 2024, we had added extra tools and developed the platform sufficiently to support the work of most researchers using UK Biobank. As well as making the data more accessible to approved researchers, the platform also enhances security by giving us more control over who uses our data.
As a result, further releases of data are now only available to researchers on our platform (with a few carefully assessed exceptions). Researchers are not allowed to download data from the platform. When their current projects come to an end (the last of which will end in early 2027), they must delete any data that they received previously.
When researchers run analyses of the data, they need to be able to download their findings from the platform. Our next goal is to develop an automated ‘airlock’ that can use AI to sift through these large files and assess whether they are results files or whether they may contain participant data. Nearly five thousand researchers use the platform each month, so doing these checks manually would take around 80 people working full-time.
Updates
We will continue to post updates on this page.
Online code repositories
Even though we only ever share de-identified data, we don’t want the data to be used by researchers who have not gone through our rigorous access review process.
Researchers use online repositories to publish their computer code to be transparent about how they have done their analyses, and so that other researchers can verify their results.
Increasingly during 2025, as scientific journals encouraged researchers to use online repositories, we faced the challenge of researchers inadvertently including de-identified participant data with the code.
We have taken multiple steps to stop this happening, including mandatory training for researchers, automated daily scans for data in the wrong place, and sanctions for anyone accidentally sharing data.
Our investigations have led us to suspend access to UK Biobank data for a major US university because one of its researchers was not giving sufficient care and attention to the data they were responsible for.
We are developing the world’s first automated checking system able to prevent de-identified participant data from being taken off the UK Biobank research platform, without preventing the important research that is being done by thousands of scientists around the world. We intend to have this automated system in place around the end of 2026.
In March 2026, we published a statement for our participants about de-identified data being unintentionally added to repositories.
Online listings
At the end of April 2026, UK Biobank identified that participant data were being offered for sale on a consumer website.
We are sorry this happened. An Oversight Committee has since carried out a full investigation and has set out both what happened and the actions required to strengthen the protection of participant data in future.
We are fully committed to implementing all of the recommendations in the report to improve our systems and processes.
Contact us
If you have any questions, or would like further information, please do not hesitate to get in touch on [email protected]
Related content
We made an important commitment to our participants when they joined the study. Find out how we uphold this commitment.
Learn more about how UK Biobank works including information on who can use UK Biobank data, ethics and regulations, and how we protect our data.