Linear discriminant analysis and principal component analysis to predict coronary artery disease

Carlo Ricciardi, Antonio Saverio Valente, Kyle Edmund, Valeria Cantoni, Roberta Green, Antonella Fiorillo, Ilaria Picone, Stefania Santini, Mario Cesarelli
DOI: https://doi.org/10.1177/1460458219899210
2020-01-23
Health Informatics Journal
Abstract: Coronary artery disease is one of the most prevalent chronic pathologies in the modern world, leading to the deaths of thousands of people, both in the United States and in Europe. This article reports the use of data mining techniques to analyse a population of 10,265 people who were evaluated by the Department of Advanced Biomedical Sciences for myocardial ischaemia. Overall, 22 features are extracted, and linear discriminant analysis is implemented twice, using both the KNIME analytics platform and the R statistical programming language, to classify patients as either normal or pathological. The first analysis performs classification alone, while the second applies principal component analysis before classification to create new features. The classification accuracies obtained for these two methods were 84.5 and 86.0 per cent, respectively, with a specificity over 97 per cent and a sensitivity between 62 and 66 per cent. This article presents a practical implementation of traditional data mining techniques that can help clinicians in decision-making; moreover, principal component analysis is used as an algorithm for feature reduction.
health care sciences & services, medical informatics
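
The abstract reports accuracy, specificity and sensitivity together, and the gap between them is easier to read with the definitions written out. The following is a minimal sketch, not the authors' code: the function name `classification_metrics` and the confusion-matrix counts are hypothetical and were chosen only for illustration, with "pathological" treated as the positive class. The counts are not taken from the paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts.

    tp: pathological patients classified as pathological (true positives)
    fn: pathological patients classified as normal (false negatives)
    tn: normal patients classified as normal (true negatives)
    fp: normal patients classified as pathological (false positives)
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,       # share of all patients classified correctly
        "sensitivity": tp / (tp + fn),       # share of pathological patients detected
        "specificity": tn / (tn + fp),       # share of normal patients recognised as normal
    }


# Hypothetical counts for illustration only (NOT data from the paper):
# 100 pathological patients and 200 normal patients.
metrics = classification_metrics(tp=64, fn=36, tn=194, fp=6)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these made-up counts, most normal patients are classified correctly while roughly a third of pathological patients are missed, yet overall accuracy stays high because normal patients are the majority. This is one way a specificity above 97 per cent and a sensitivity near 64 per cent can coexist with an accuracy around 86 per cent.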