Principal Investigator: Dr Anthony Webster
Institution: University of Oxford
Tags: 42583, ageing, classification, clustering, disease
Patterns of disease within a population can indicate common causes. These causes can either be avoided, or treatments can be developed for them. One pattern of disease would be an increased risk of asthma in an area of high pollution. This project explores whether the time from a person’s birth to the occurrence of a disease can be combined with other medical information to identify the existence of high-risk groups, or to better understand the processes by which diseases progress. This is possible through a combination of “big data” and emerging statistical methods such as “machine learning”.
Our primary aim is to search for new links between diseases. We will develop a new classification of diseases, with the expectation of providing new insights and a deeper biological understanding of links between them. Such insights have previously led to new lines of research, better treatment, and improved prevention.
A secondary aim is to identify groups within the population that are especially susceptible to particular diseases. The existence of higher-risk groups would prompt further work to identify individuals at risk and to modify advice.
The research is made possible by a combination of large data sets (“Big Data”) and emerging statistical methods. Older methods will be combined with new techniques to make full use of the benefits of big data. The majority of the time (roughly 60%) will be used to develop, implement, and optimise new methods, possibly with updated studies if new data become accessible. The rest will be used to explore the consequences of the results, and to report them. The project will run for an initial period of 36 months, and for longer if results prompt further studies and funding permits.
The project’s impact is likely to be felt through new lines of research to understand and tackle disease, and indirectly through subsequent use of the new methods. If high-risk groups within the population are identified, this may influence future medical advice and diagnosis.
Project extension February 2019:
Time of disease incidence, risk factors, and co-occurring diseases will all be used to understand what the present disease classification system captures, and to suggest a new or modified system. As before, mature methodologies such as Cox proportional hazards will be combined with more modern “machine learning” methodologies to identify clustering of diseases.
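As a rough illustration of how these two kinds of method could fit together, the sketch below clusters diseases by how similarly they respond to a set of risk factors. It is not the project's actual method. It assumes that a survival model such as Cox proportional hazards has already been fitted for each disease, and it skips that step entirely. The disease names, the numbers in TOY_PROFILES (standing in for fitted log hazard ratios for height, BMI, and smoking), and the choice of single-linkage clustering are all invented for the example and are not results.

```python
import math
from itertools import combinations

# Hypothetical log hazard ratios per disease for (height, BMI, smoking).
# These values are invented for illustration only; they are NOT results.
# In practice they would come from a fitted survival model.
TOY_PROFILES = {
    "disease_A": (0.02, 0.30, 0.55),
    "disease_B": (0.01, 0.28, 0.60),
    "disease_C": (0.20, -0.05, 0.02),
    "disease_D": (0.22, -0.02, 0.00),
}


def cluster(profiles, n_clusters):
    """Single-linkage agglomerative clustering of diseases by risk profile."""
    # Start with every disease in its own cluster.
    clusters = [{name} for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i, j in combinations(range(len(clusters)), 2):
            d = min(
                math.dist(profiles[a], profiles[b])
                for a in clusters[i]
                for b in clusters[j]
            )
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i (i < j, so index i stays valid).
        clusters[i] |= clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


groups = cluster(TOY_PROFILES, 2)
print(groups)
```

With these toy values, diseases A and B (driven mainly by BMI and smoking) end up in one group, and C and D (driven mainly by height) in the other. A real analysis would need to account for uncertainty in the fitted coefficients and would probably use a more robust clustering method.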
The primary interest is in well-recognised risk factors such as height, BMI, and smoking. As noted in the original application, the influence of genetics or less commonly studied risk factors may be important. These will be studied as required, most likely later in the project (years 2 or 3).
Last updated Mar 12, 2019