
Approved Research

Identification of determinant combinations of inflammatory arthritis using machine learning-based early detection methods.

Principal Investigator: Professor Weizi Li
Approved Research ID: 168875
Approval date: March 7th 2024

Lay summary

Over 20 million people in the UK live with rheumatic and musculoskeletal diseases (RMD). Inflammatory arthritis (IA) is a major subdivision of RMD in which joint inflammation leads to damage. IA causes long-term pain and disability and incurs substantial personal and societal costs. For rheumatoid arthritis (RA, one type of IA) alone, the estimated annual cost to the UK economy from sick leave and work-related disability totals £1.8 billion. Diagnosed IA cases in the UK are also estimated to have increased by 40% between 2004 and 2020.

There are still significant unmet needs in the IA patient pathway. IA presents with non-specific symptoms, and there is currently no single diagnostically definitive biomarker for detecting it. Early detection is critical but challenging: delayed detection and late referral often mean missing the window of opportunity in which effective treatment should start, which can lead to disability and associated unemployment. For example, approximately half of early IA patients are not referred and treated within the ideal timeframe. Multiple factors also affect IA development and progression, including patients' genetics, biology, socioeconomic situation, environment and weather, which makes accurate detection even more difficult. For example, only 40% of suspected early IA diagnoses referred by GPs in 2019/2020 proved to be accurate. Rheumatology clinicians in secondary care need to assess those referrals within three weeks, but fewer than half of hospitals can achieve this target time according to the National Early Inflammatory Arthritis Audit (NEIAA). Inaccurate referrals can lengthen the time it takes patients to gain access to the right clinics.

Although studies have shown potential determinants of IA, no existing research or machine learning method can identify the as-yet-undetected combinations of determinants that would offer a useful level of prediction of IA.

In this project we will develop a novel multimodal representation learning method that learns the underlying relationships between the data types and modalities available for rheumatic and musculoskeletal disease patients in Biobank. It is the first ML research of its type in the rheumatology discipline to combine multiple data sources for early IA detection and to develop real-world evidence to inform future clinical practice. The method will enable scalable multimodal machine learning. This means our model will discover complex IA determinants in the real world more effectively than existing multimodal methods that require complete data in all modalities.