Integrating TANBN with cost sensitive classification algorithm for imbalanced data in medical diagnosis

Dan Gan, Jiang Shen, Bang An, Man Xu, Na Liu
DOI: https://doi.org/10.1016/j.cie.2019.106266
2020-02-01
Abstract: For imbalanced classification problems, most traditional classification models focus only on finding a classifier that maximizes classification accuracy under a fixed misclassification cost, without taking into account that the misclassification cost can change with the sample probability distribution. As far as we know, cost-sensitive learning can be used effectively to solve imbalanced data classification problems. In this regard, we propose an algorithm that integrates TANBN with cost-sensitive classification (AdaC-TANBN) to overcome this drawback and improve classification accuracy. The AdaC-TANBN algorithm uses a variable misclassification cost, determined by the sample distribution probability, to train the classifier, and then classifies imbalanced data in medical diagnosis. The effectiveness of the proposed approach is examined on the Cleveland heart dataset (Heart), the Indian liver patient dataset (ILPD), the Dermatology dataset, and the Cervical cancer risk factors dataset (CCRF) from the UCI Machine Learning Repository. The experimental results indicate that the AdaC-TANBN algorithm can outperform other state-of-the-art comparative methods.
computer science, interdisciplinary applications; engineering, industrial
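The abstract's central idea, a misclassification cost that depends on how the samples are distributed rather than a fixed cost, can be illustrated with a minimal sketch. This is not the authors' AdaC-TANBN algorithm: the inverse-frequency cost rule, the function names `class_costs` and `min_expected_cost_label`, and the hard-coded posterior (standing in for the output of a trained classifier such as TANBN) are all assumptions made for illustration.

```python
from collections import Counter


def class_costs(labels):
    """Derive a per-class misclassification cost from the class distribution.

    Assumed rule (not taken from the paper): the cost of misclassifying a
    sample of class c is inversely proportional to the frequency of c,
    scaled so that a perfectly balanced dataset gives every class cost 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}


def min_expected_cost_label(posterior, costs):
    """Return the label whose prediction has the lowest expected cost.

    posterior maps each class to P(class | x); costs maps each class to the
    cost of misclassifying a sample that truly belongs to that class.
    """
    def expected_cost(predicted):
        # Predicting `predicted` is wrong whenever the true class differs.
        return sum(p * costs[c] for c, p in posterior.items() if c != predicted)

    return min(posterior, key=expected_cost)


# Hypothetical imbalanced training labels: 90 healthy, 10 sick.
train_labels = ["healthy"] * 90 + ["sick"] * 10
costs = class_costs(train_labels)

# Hypothetical posterior for one patient, as a classifier might output it.
posterior = {"healthy": 0.8, "sick": 0.2}
decision = min_expected_cost_label(posterior, costs)
```

Under these assumptions, missing a minority-class case costs more than a false alarm, so the cost-sensitive decision for this patient is `sick` even though plain accuracy maximization would pick the more probable `healthy`.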