
Using GP data of UK Biobank participants

The success of UK Biobank and our ability to enable research depend on the extent to which we can access information about health outcomes in UK Biobank participants. We do this in accordance with the consent that UK Biobank participants gave when they first agreed to participate. They volunteered to share all their health-related records, along with other information about themselves, for health-related research, even after their incapacity or death.

Accessing GP data 

For many years, UK Biobank has received updates from the NHS about participant hospitalisations, cancer diagnoses, deaths and use of other central NHS services.  

However, we have been unable to access data under the control of GPs, despite participants providing consent for this when they first joined the study (see Participant Consent Form). These data would greatly increase the value of the resource for researchers.

This is because GP data are not held centrally by the NHS, but instead are controlled by each individual GP practice. We have not been able to receive approval from many of these practices for the release of these data, because they are either too busy, unsure about the rules on data sharing, unwilling, or simply not aware of the data's importance for research purposes.

UK Biobank can assure GPs regarding consent, data governance and protection (see below) and can ensure that GP practices are not exposed to liability. Additionally, we know that GPs face large workloads, so we have ensured that sharing data is a quick administrative task.

Participants have given explicit consent for us to access all their medical and health-related records, and we have been receiving data on cancers and hospitalisations from the NHS for years. Despite our previous efforts over the years, we’ve only had the GP data during the pandemic, which showed us its importance for understanding how to prevent and treat disease. Participants are, understandably, surprised and disappointed and we want to work with GPs so they’re happy to push the button, which they can always ‘un-push’ if they ever wish to. We are also happy to indemnify GPs for sharing the data, and cover their administration costs.

Professor Sir Rory Collins, Principal Investigator and CEO of UK Biobank 

Benefits of using GP data 

Linking UK Biobank data to GP data will transform the scientific value and clinical relevance of the research questions that can be addressed using the UK Biobank resource, particularly for diseases that are largely managed in primary care (such as diabetes, dementia, arthritis, chronic pain and many mental health conditions).

During the pandemic, emergency legislation made it possible for UK Biobank to access the GP data of participants solely for the purpose of COVID-19 research. This access demonstrated just how valuable the GP data can be for research using the UK Biobank resource. For example, it enabled identification of various factors (such as obesity, prior acute kidney failure, and previous infections) that increase the risk of a person having severe COVID-19, as well as revealing changes to brain structure following infection. 

This boost to research also demonstrated what has been lost during the past decade as a result of our inability to access GP data for research into other health conditions. 

Allowing UK Biobank to access consented, de-identified GP data is vitally important as it will substantially enhance the research that will be achieved and enable a much wider range of important health findings to emerge.  

Example of data gaps from lack of access to GP records  

Three overlapping circles. The largest is labelled ‘48% of cases are recorded only by GPs’. The next largest, which overlaps a small amount with the first, is labelled ‘Cases recorded via hospital admissions data'. The third, most of which overlaps with both circle 1 and circle 2, is labelled ‘Cases recorded via death certificates’.

Venn diagram illustrating the percentage of diagnoses for Alzheimer’s disease recorded in GP records, hospital admissions data and death certificates. Source data: UK Biobank participants in England

Alzheimer’s disease 

Almost half (48%) of the diagnosis information for Alzheimer’s disease is held only in GP records.  

If this information isn’t accessible, it not only reduces the amount of data available to power research, but it also means that the research uses cases which are likely to be further progressed, and therefore less useful for research into the causes of Alzheimer’s. 

Percentages of data held by location: 

  • 48% of cases recorded only by GPs 
  • 27% of cases recorded only in hospital admissions data  
  • 1.5% of cases recorded only on the death certificate

Two overlapping circles. The largest is labelled ‘50% of cases are recorded only by GPs’. The second, which overlaps a small amount with the first, is labelled ‘Cases recorded via hospital admissions data’.

Venn diagram illustrating the percentage of diagnoses for depression recorded in GP records and hospital admissions data. Source data: UK Biobank participants in England


Depression

Half of the diagnosis information for depression is held only in GP records, so researchers are missing out on analysing the data of half of the UK Biobank participants diagnosed with depression. The cases researchers can analyse will be skewed towards severe depression, because people diagnosed with depression upon hospital admission tend to have more severe illness.

There is evidence that severe depression has a different genetic profile to mild depression, reducing the extent to which research in the hospital-diagnosed participants can be applied to people diagnosed by their GP.

Percentages of data held by location: 

  • 50% of cases recorded only by GPs 
  • 36% of cases recorded only in hospital episode statistics 
  • 14% of cases are recorded in both GP records and hospital episode statistics 

Data protection 

GPs are rightly cautious about sharing their patients’ data. UK Biobank has investigated this carefully and can give the following assurances: 

  • The Information Commissioner has confirmed that the participants’ existing explicit written consent is compliant with current data protection legislation and supports this approach.
  • UK Biobank has already been making de-identified NHS health outcome data (e.g. deaths, cancers, and hospitalisations) available to approved researchers for the past 10 years.
  • The same strict data protection and information governance processes would be applied for access to the primary care data, which would be restricted to coded information.
  • This request is endorsed by the Royal College of GPs and NHS England.

UK Biobank fully appreciates that GP practices take their responsibilities as a data controller very seriously and are rightly concerned about the potential risks and exposure involved in making these data available to another data controller (UK Biobank in this case).

In this regard, UK Biobank has provided a number of practical assurances, including that only de-identified coded data will be accessible by researchers and that the data will remain on the UK Biobank Research Access Platform (where the data are stored within the UK, on the AWS cloud).

Further, UK Biobank would be prepared to take on the full responsibility for the appropriate and lawful use of the primary care data by UK Biobank and its researchers, such that UK Biobank would assume responsibility for any regulatory issues and also ensure that GP practices would not carry any financial exposure: an effective indemnity.

What we would like GPs to do 

We recognise that GPs are already extremely busy, so we have ensured that the administrative task they will need to complete is straightforward and can be completed in less than a minute. The relevant data – which only includes coded health data and no written notes/free-text or attached letters – will be transferred securely to UK Biobank using proven systems, appropriately de-identified (as we do with all participants’ information) and incorporated into the resource.

These de-identified data will then be made available to researchers around the world for health research that is in the public interest. 

UK Biobank will cover costs for a GP practice to release coded, de-identified data for consented participants. 

If a GP practice decides to stop sharing the consented participant data, it can un-tick the box and no further data will be shared. A participant can, at any time, ask to withdraw their data from UK Biobank.

For GP practices running EMIS (see right hand image):

The data sharing agreement is located under the 'Configuration' panel (see 1. right) and within the 'Data Sharing Manager' (see 2. right) and is available to the GP practice administrator.

Click on 'My Agreements' (see 3. right) and select the UK Biobank agreement within the 'Data Distribution' tab (see 4. right). Click on 'Activate Agreement' (see 5. right).




For GP practices running TPP (see below and right hand images):

Please note that you will need 'system administrator' access rights to make the following changes. To access your organisation preferences, go to 'Setup' and then to 'Users & Policy.' 





To see the data sharing agreement located within the 'Organisation Preferences' panel in SystmOne (see 1. right), select the 'Research' element (see 2. right).

To activate the agreement, a GP practice simply needs to read the brief summary and then to tick the checkbox marked 'Opt-in to UK Biobank' (see 3. below).

How do I know which of my patients are UK Biobank participants?

Unless a patient has mentioned their participation to you directly, it is unlikely that you would know which of your patients are in UK Biobank. If you would like to receive a list of patients in your practice who are in UK Biobank, please email us, including your Practice ID in the message. We will then arrange for a list to be provided to you. Please bear in mind that, although we can guarantee that all the individuals on the list are consented UK Biobank participants, the practice registration may not be up-to-date (e.g. if a patient has recently moved to another practice).

Letter to GPs – September 2023 

In September 2023 we sent a joint letter, co-signed by UK Biobank, the Royal College of General Practitioners and NHS England, to all GP practices in England asking them to share the relevant data with us.   

A first publication of the letter mistakenly referred to the British Medical Association (BMA) as endorsing the request.  The error was corrected immediately. We hope that the BMA will be able to support the process explicitly in the future. 


For participants 

Our participants don’t need to take any action. Your GP practice may contact you to check that you are a UK Biobank participant. 

Scotland and Wales 

This page applies only to GPs in England, as we have had (and continue to have) access to GP data from data providers in Scotland and Wales.

Contact details

For further information please email:
