Building prediction models and discovering important factors of health insurance fraud using machine learning methods
Venkateswarlu Nalluri, Jing-Rong Chang, Long-Sheng Chen, Jia-Chuan Chen
DOI: https://doi.org/10.1007/s12652-023-04633-6
IF: 3.662
2023-05-20
Journal of Ambient Intelligence and Humanized Computing
Abstract: Health insurance fraud accounts for 3–10% of total medical expenditures every year. If fraudulent activity is allowed to grow unchecked, it will cause irreversible damage to the medical system. However, medical data are too large and complex to process with traditional statistical methods, so machine learning algorithms have become an important solution. An open question is whether a learning method can remain stable and still give appropriate answers when faced with different data. Many related studies have focused on medical insurance fraud detection and assessment, but few attempt to discover the important factors of medical fraud or to identify the best-performing machine learning method. Therefore, this study used two unpublished datasets, which may reveal novel knowledge, and four machine learning methods, namely Support Vector Machines (SVM), Decision Trees (DT), Random Forest (RF), and Multilayer Perceptron (MLP), to find the method that most effectively detects medical fraud. From the DT results, we also extracted 19 crucial characteristics of medical insurance fraud and grouped them into 4 categories: medical service providers, applied insurance claims amount, Healthcare Common Procedure Coding System (HCPCS), and beneficiary. The experimental results could provide valuable suggestions for insurance management to establish an automatic audit mechanism to eliminate medical fraud.
computer science, information systems, telecommunications, artificial intelligence
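
The abstract mentions extracting important fraud factors from a decision tree. As a rough illustration of the idea only, and not the authors' actual procedure, the sketch below ranks features by the largest Gini impurity reduction a single threshold split can achieve, which is the criterion a CART-style decision tree evaluates at each node. The feature names (`claim_amount`, `num_procedures`) and the toy data are hypothetical and are not taken from the study's datasets.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split_gain(values, labels):
    """Largest Gini impurity reduction over all thresholds on one feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Candidate thresholds: every distinct value except the largest,
    # splitting the samples into value <= t and value > t.
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - weighted)
    return best


def rank_features(rows, labels, names):
    """Return (feature name, gain) pairs sorted from most to least informative."""
    gains = {
        name: best_split_gain([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy claims: (claim_amount, num_procedures); label 1 = fraud.
names = ["claim_amount", "num_procedures"]
rows = [(100, 3), (120, 1), (130, 4), (900, 2), (950, 3), (1000, 1)]
labels = [0, 0, 0, 1, 1, 1]

ranking = rank_features(rows, labels, names)
for name, gain in ranking:
    print(f"{name}: impurity reduction = {gain:.3f}")
```

In this toy data the claim amount separates the two classes perfectly, so it ranks first. A full decision tree would apply the same criterion recursively and accumulate each feature's impurity reduction across all of its splits.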