Evaluating the Efficacy of Machine Learning Models in Credit Card Fraud Detection

Gregorius Airlangga
DOI: https://doi.org/10.47709/cnahpc.v6i2.3814
2024-05-28
Abstract: This research evaluates the effectiveness of various machine learning models in detecting credit card fraud on a dataset of 555,719 transactions. The study compares traditional and advanced models, including Logistic Regression, Support Vector Machines (SVM), Random Forest, Gradient Boosting, k-Nearest Neighbors (k-NN), Naive Bayes, AdaBoost, LightGBM, XGBoost, and Multilayer Perceptrons (MLP), in terms of accuracy and reliability. Using a methodology that combines extensive data preprocessing, feature engineering, and 5-fold stratified cross-validation, the research identifies XGBoost as the most effective model, with a near-perfect mean accuracy of 0.9990 and minimal variability. The results emphasize the importance of model choice and data preparation, and the potential of ensemble and boosting techniques for handling the complexities of fraud detection. The findings contribute to the academic discourse on fraud detection and suggest practical applications for real-world systems that aim to strengthen the security of financial transactions. Future research directions include exploring hybrid models and adapting to evolving fraud tactics through continuous learning systems.
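The evaluation protocol named in the abstract, 5-fold stratified cross-validation reported as a mean accuracy with its variability, can be illustrated with a minimal standard-library sketch. This is not the author's code: the abstract does not describe the implementation, and the function names (`stratified_folds`, `cross_validate`), the toy threshold classifier, and the synthetic data below are all hypothetical stand-ins chosen for illustration.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, preserving class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a near-equal share per class.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def cross_validate(features, labels, fit, predict, k=5, seed=0):
    """Return (mean, standard deviation) of per-fold accuracy."""
    scores = []
    for held_out in stratified_folds(labels, k, seed):
        test_idx = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in test_idx]
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return mean(scores), pstdev(scores)


# Hypothetical toy data: one feature, with the rare class (label 1) at high values.
features = [[float(v)] for v in range(100)]
labels = [1 if v >= 90 else 0 for v in range(100)]


def fit_threshold(X, y):
    # Assumed stand-in model: midpoint between the two classes' nearest values.
    largest_negative = max(x[0] for x, t in zip(X, y) if t == 0)
    smallest_positive = min(x[0] for x, t in zip(X, y) if t == 1)
    return (largest_negative + smallest_positive) / 2


def predict_threshold(threshold, x):
    return 1 if x[0] > threshold else 0


mean_acc, std_acc = cross_validate(features, labels, fit_threshold, predict_threshold)
print(f"mean accuracy = {mean_acc:.4f}, std = {std_acc:.4f}")
```

Stratification matters here because the positive class is rare: dealing each class's indices separately keeps the class ratio of every held-out fold close to that of the full dataset, so the per-fold accuracies are comparable before they are averaged.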