Using Topological Data Analysis for diagnosis pulmonary embolism

Matteo Rucco, Lorenzo Falsetti, Damir Herman, Tanya Petrossian, Emanuela Merelli, Cinzia Nitti, Aldo Salvi
DOI: https://doi.org/10.48550/arXiv.1409.5020
IF: 4.506
2014-09-17
Medical Physics
Abstract: Pulmonary embolism (PE) is a common and potentially lethal condition. Most patients die within the first few hours of the event. Despite diagnostic advances, delays and underdiagnosis in PE are common. To increase diagnostic performance in PE, the current diagnostic work-up of patients with suspected acute pulmonary embolism usually starts with the assessment of clinical pretest probability using plasma D-dimer measurement and clinical prediction rules. The most validated and widely used clinical decision rules are the Wells and revised Geneva scores. We aimed to develop a new clinical prediction rule (CPR) for PE based on topological data analysis and an artificial neural network. Filter or wrapper methods for feature reduction could not be applied to our dataset, because these algorithms can only be run on datasets without missing data. Instead, we applied topological data analysis (TDA) to overcome the hurdle of processing datasets with missing (null) values. A topological network was developed using the Iris software (Ayasdi, Inc., Palo Alto). The PE patient topology identified two areas in the pathological group and hence two distinct clusters of PE patient populations. Additionally, the topological network detected several subgroups among healthy patients who are likely affected by non-PE diseases. TDA was further used to identify the key features that are best associated with PE as diagnostic factors, and this information was used to define the input space for a back-propagation artificial neural network (BP-ANN). The area under the curve (AUC) of the BP-ANN is shown to be greater than the AUCs of the scores (Wells and revised Geneva) used by physicians. The results demonstrate that topological data analysis and the BP-ANN, when used in combination, can produce better predictive models than the Wells or revised Geneva scoring systems for the analyzed cohort.
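The abstract compares models by area under the ROC curve. As a minimal sketch of what that comparison measures, the following computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and all numbers are hypothetical illustrations and are not data, code, or results from the study.

```python
def auc(labels, scores):
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy values invented purely for illustration (assumption, not study data).
labels = [1, 1, 1, 0, 0, 0]                   # 1 = PE confirmed, 0 = PE excluded
rule_score = [4, 2, 6, 3, 1, 2]               # hypothetical point-based rule
model_prob = [0.9, 0.6, 0.8, 0.4, 0.1, 0.3]   # hypothetical classifier output

print("rule AUC: ", auc(labels, rule_score))
print("model AUC:", auc(labels, model_prob))
```

A higher AUC means the score ranks confirmed cases above excluded ones more consistently. The pairwise form is quadratic in the number of cases, which is fine for a small cohort but would normally be replaced by a rank-based computation on larger data.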