A Machine Learning Aided Systematic Review and Meta-Analysis of the Relative Risk of Atrial Fibrillation in Patients with Diabetes Mellitus

Zhaohan Xiong, Tong Liu, Gary Tse, Mengqi Gong, Patrick Gladding, Bruce H. Smaill, Martin K. Stiles, Anne M. Gillis, Jichao Zhao
DOI: https://doi.org/10.3389/fphys.2018.00835
IF: 4
2018-01-01
Frontiers in Physiology
Abstract: Meta-analysis is a widely used tool to increase statistical power. However, the exponential growth of publications in key areas of medical science has rendered manual identification of relevant studies increasingly time-consuming. The aim of this work was to develop a machine learning technique capable of robust automatic study selection for meta-analysis. We have validated this approach with an up-to-date meta-analysis investigating the association between diabetes mellitus (DM) and atrial fibrillation (AF). The PubMed database was searched from 1960 to September 2017, and 4,177 publications that mentioned both DM and AF were identified. First, publications were clustered based on common text features using an unsupervised K-means algorithm. Clusters that best matched the selected set of potentially relevant studies were then identified using maximum entropy classification. The 139 articles selected automatically on this basis were screened manually to identify potentially relevant studies. To determine the validity of the automated process, a parallel set of studies was also assembled by manually screening all initially searched publications. Finally, detailed manual selection was performed on the full texts of the studies in both sets using standard criteria. Meta-regression, random-effects models, and sensitivity analysis were then conducted. Machine learning-assisted screening identified the same 29 studies for meta-analysis as those identified by manual screening alone. Machine learning reduced the number of studies needed for manual screening from 4,177 to 556 articles. A pooled analysis using the most conservative estimates indicated that patients with DM had ~49% greater risk of developing AF compared with individuals without DM. After adjusting for three additional risk factors (hypertension, obesity, and heart disease), the excess relative risk was 23%. Using multivariate adjusted models, the risk of developing AF in patients with DM was similar for all DM subtypes. Women with DM were 24% more likely to develop AF. The risk of AF in patients with DM has also increased over the years. We have developed a novel machine learning method to identify publications suitable for inclusion in meta-analysis. We have used it to demonstrate that DM is a strong, independent risk factor for AF, particularly for women.
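The pooling step mentioned in the abstract can be illustrated with a minimal sketch of random-effects pooling of relative risks. The abstract does not say which estimator the authors used, so the DerSimonian-Laird estimator here is an assumption, chosen only because it is a common default. The function names (`log_rr_and_se`, `pool_random_effects`) and the three input studies are hypothetical and are not taken from the paper.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% confidence interval


def log_rr_and_se(rr, ci_low, ci_high):
    """Convert a relative risk and its 95% CI to (log RR, standard error)."""
    return math.log(rr), (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)


def pool_random_effects(log_rr, se):
    """DerSimonian-Laird random-effects pooling (assumed estimator)."""
    w = [1.0 / s ** 2 for s in se]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * y for wi, y in zip(w, log_rr)) / sum(w)
    # Cochran's Q measures how much the studies disagree with each other.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rr))
    df = len(log_rr) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (s ** 2 + tau2) for s in se]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, log_rr)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled_se, tau2


# HYPOTHETICAL inputs (RR, CI low, CI high) for illustration only;
# these are not results from the paper.
studies = [(1.40, 1.10, 1.78), (1.20, 0.95, 1.52), (1.65, 1.30, 2.09)]
ys, ses = zip(*(log_rr_and_se(*s) for s in studies))
pooled, pooled_se, tau2 = pool_random_effects(ys, ses)
rr = math.exp(pooled)
ci = (math.exp(pooled - Z_95 * pooled_se), math.exp(pooled + Z_95 * pooled_se))
print(f"pooled RR = {rr:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), tau^2 = {tau2:.4f}")
```

Pooling is done on the log scale because the sampling distribution of a log relative risk is closer to normal; the result is exponentiated at the end. A pooled RR of 1.49 would correspond to the "~49% greater risk" phrasing used in the abstract.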