Approved research

Building a Global Health Foundational Model using generative AI algorithms

Principal Investigator: Dr Daniel Martin-Herranz
Approved Research ID: 240581
Approval date: May 8th 2024

Lay summary

Generative Artificial Intelligence (AI) algorithms have already revolutionised our lives across domains such as text, image and video. For example, ChatGPT is one of the most successful technology launches of all time, helping millions of users around the world to increase productivity and brainstorm ideas through text generation. Applications like ChatGPT are built on top of Foundational Models, which are trained on huge datasets and can be used as building blocks for downstream applications.

Our healthcare systems are currently mired in complexity, often hampered by disjointed data, low interoperability, high administrative burden and slow innovation. The potential of AI to disrupt healthcare is tremendous, but there is a need to create a Foundational Model that is suitable for healthcare applications.

In this project we will develop the world's first Global Health Foundation Model by leveraging the latest AI methods and large-scale computational power and infrastructure, and by combining de-identified health data from millions of volunteers across diverse datasets (including the UK Biobank). With this model, we aim to synthesise and understand health data on an unprecedented scale and to generate high-fidelity synthetic health data while ensuring the highest standards of privacy and security.

All of this will lead to a transformation in healthcare delivery, risk assessment, and personal health insights. The applications of the Global Health Foundation Model probably go beyond what we can imagine today. Examples include helping patients to better understand their medical results (e.g. by talking with their Electronic Health Record in plain language) and supporting doctors in making better treatment decisions (e.g. by contextualising the latest laboratory test or radiology image, producing more accurate diagnoses and predicting how a patient's condition will evolve).

This research will benefit the health and wellbeing of society. By reducing the time and costs associated with health data access, we will enable more people to access and analyse data-driven insights (including researchers, innovators, medical professionals and epidemiologists). Furthermore, by providing the foundational block to train other models for specific healthcare applications, we will ultimately accelerate the impact of health advancements. Finally, by training the model on data from individuals with different genetic backgrounds and environmental contexts, we will ensure health equity is achieved and no one is left behind.