Abstract: In recent years, credit card transaction fraud has caused massive losses for both consumers and banks. Consequently, both cardholders and banks need a strong fraud detection system to reduce these losses. Credit card fraud detection (CCFD) is an important method of fraud prevention. However, developing an ideal fraud detection system for banks poses several challenges. First, due to data security and privacy concerns, banks and other financial institutions are typically not permitted to exchange their transaction datasets, which makes it difficult for traditional systems to learn and detect fraud patterns. Therefore, this paper proposes federated learning for CCFD over different frameworks (TensorFlow Federated, PyTorch). Second, credit card transactions are significantly imbalanced across all banks: the small percentage of fraudulent transactions is far outnumbered by the majority of valid ones. Developing a powerful model to identify fraudulent transactions therefore requires balancing the dataset, which calls for a comprehensive investigation of class imbalance management techniques. To address class imbalance, this study also gives a comparative analysis of several individual and hybrid resampling techniques. Several experimental studies compare the effectiveness of various resampling techniques in combination with classification approaches. The study finds that the hybrid resampling methods perform better with machine learning classification models than with deep learning classification models. The experimental results show that the best accuracies for the Random Forest (RF), Logistic Regression (LR), K-Nearest Neighbors (KNN), Decision Tree (DT), and Gaussian Naive Bayes (NB) classifiers are 99.99%, 94.61%, 99.96%, 99.98%, and 91.47%, respectively. The comparative results show that RF outperforms the other classifiers on the performance metrics (accuracy, recall, precision, and F-score). RF achieves the minimum loss values with all resampling techniques, and applying the proposed models to the entire skewed dataset after balancing yields better outcomes than using the unbalanced dataset. Furthermore, the PyTorch framework achieves higher prediction accuracy for the federated learning model than the TensorFlow Federated framework, but at the cost of more computational time.
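To make the idea of hybrid resampling concrete, the sketch below combines random undersampling of the majority class with random oversampling of the minority class. It is a deliberately simplified stand-in: the abstract does not name the specific resampling techniques compared in the paper, and the function `hybrid_resample`, the parameter `target_per_class`, and the toy data are all hypothetical names introduced only for illustration.

```python
import random
from collections import Counter


def hybrid_resample(X, y, target_per_class, seed=0):
    """Bring every class to `target_per_class` rows.

    Classes with more rows than the target are randomly undersampled
    (without replacement); classes with fewer rows are randomly
    oversampled (with replacement). This is an illustrative assumption,
    not the procedure used in the paper.
    """
    rng = random.Random(seed)

    # Group feature rows by class label.
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)

    X_out, y_out = [], []
    for label, rows in by_class.items():
        if len(rows) >= target_per_class:
            chosen = rng.sample(rows, target_per_class)
        else:
            extra = rng.choices(rows, k=target_per_class - len(rows))
            chosen = rows + extra
        X_out.extend(chosen)
        y_out.extend([label] * target_per_class)
    return X_out, y_out


# Toy imbalanced data: 990 valid transactions (0) and 10 fraudulent ones (1).
X = [[float(i)] for i in range(1000)]
y = [0] * 990 + [1] * 10

X_bal, y_bal = hybrid_resample(X, y, target_per_class=200)
counts = Counter(y_bal)
print(counts)
```

With these toy inputs, both classes end up with 200 rows each. Plain random oversampling only duplicates existing minority rows; synthetic oversamplers generate new points instead, which is one reason different resampling techniques can behave differently across classifiers.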

What problem does this paper attempt to address?

The paper "Federated Learning Model for Credit Card Fraud Detection with Data Balancing Techniques" addresses the significant challenge of credit card fraud detection (CCFD) in the context of modern electronic services and the rapid increase in credit card transactions. The key problems and contributions of the paper can be summarized as follows:

### Problems Addressed:

1. **Data Security and Privacy Concerns**: Banks and financial institutions are typically not allowed to share their transaction datasets due to data security and privacy concerns. This makes it difficult for traditional systems to learn and detect fraud.
2. **Class Imbalance**: There is a significant imbalance in credit card transactions across all banks, with a small percentage of fraudulent transactions far outnumbered by legitimate ones. This imbalance makes it challenging for predictive models to find patterns in the data from the minority (fraudulent) class.

### Contributions:

1. **Federated Learning Approach**: The paper proposes a federated learning approach to enable different banks to collaboratively train a fraud detection model without sharing raw data. This approach allows financial institutions to benefit from a shared global model that has seen more fraud than each bank alone, thereby improving fraud detection accuracy while maintaining data privacy.
2. **Resampling Techniques**: The paper investigates several individual and hybrid resampling techniques to address the class imbalance problem.
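The collaborative training described above can be sketched with federated averaging, in which each bank trains on its own data and only model weights are sent to a server for aggregation. This is a minimal illustration under stated assumptions, not the paper's implementation: the summary does not specify the aggregation rule or model, so the logistic-regression client, the size-weighted average, and every name here (`local_update`, `federated_average`, `make_bank_data`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_update(weights, data, lr=0.1, epochs=5):
    """One bank's local training: plain SGD on a logistic-regression model.

    Only the returned weights leave the bank; `data` never does.
    """
    w = list(weights)
    for _ in range(epochs):
        for x, label in data:
            pred = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            err = pred - label
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


def make_bank_data(rng, n):
    """Hypothetical toy data: one feature plus a bias term, label 1 when x > 0."""
    rows = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        rows.append(([x, 1.0], 1 if x > 0 else 0))
    return rows


rng = random.Random(0)
banks = [make_bank_data(rng, n) for n in (50, 80, 120)]  # three simulated banks

global_w = [0.0, 0.0]
for _ in range(10):  # communication rounds
    updates = [local_update(global_w, data) for data in banks]
    global_w = federated_average(updates, [len(d) for d in banks])

print(global_w)
```

Weighting the average by dataset size lets banks with more transactions contribute proportionally more to the global model. Real deployments would typically add secure aggregation or similar protections on top, since exchanged weights can still leak information.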
