Global Confidence Degree Based Graph Neural Network for Financial Fraud Detection

Jiaxun Liu, Yue Tian, Guanjun Liu
2024-08-18
Abstract: Graph Neural Networks (GNNs) are widely used in financial fraud detection because they handle graph-structured financial data well and model multilayer connections by aggregating information from neighbors. However, these GNN-based methods focus on extracting neighbor-level information and neglect a global perspective. This paper presents the concept and calculation formula of the Global Confidence Degree (GCD) and designs a GCD-based GNN (GCD-GNN) that captures more global information and thereby addresses the challenge of camouflage in fraudulent activities. To obtain a precise GCD for each node, we use a multilayer perceptron to transform features; the new features and the corresponding prototype are then used to eliminate unnecessary information. The GCD of a node evaluates the typicality of the node, so we can leverage GCD to generate attention values for message aggregation. This process is carried out with both the original GCD and its inverse, allowing us to capture both typical neighbors with high GCD and atypical ones with low GCD. Extensive experiments on two public datasets demonstrate that GCD-GNN outperforms state-of-the-art baselines, highlighting the effectiveness of GCD. We also design a lightweight GCD-GNN (GCD-GNN$_{light}$) that also outperforms the baselines but is slightly weaker than GCD-GNN in fraud detection performance. However, GCD-GNN$_{light}$ clearly outperforms GCD-GNN in convergence and inference speed.
Machine Learning
What problem does this paper attempt to address?
This paper attempts to solve two main problems in financial fraud detection:

1. **Complex relationships**: Financial fraud activities usually involve complex relationships between entities, and it is difficult to directly identify these relationships based only on the direct connections between entities.
2. **Camouflaged activities**: Fraudsters often adopt strategies to conceal their fraudulent behaviors, which makes detection more difficult.

To address these problems, the paper proposes a graph neural network (GNN) method based on global confidence degree (GCD), called GCD-GNN. This method improves the traditional GNN method in the following ways:

- **Global perspective**: Traditional GNN methods mainly focus on the extraction of local neighbor information and ignore the global perspective. GCD-GNN can capture more information from a global perspective by introducing global confidence degree, so as to better deal with camouflaged activities.
- **Feature optimization**: Transform the original features through a multi-layer perceptron (MLP) to generate new features, and combine with prototypes to eliminate unnecessary information and improve the separation between fraud nodes and benign nodes.
- **Message aggregation**: Use GCD to generate attention weights and perform message aggregation from both typical and atypical perspectives, so as to extract information more effectively.

Specifically, the main contributions of GCD-GNN include:

- **Generate better prototypes**: By transforming features, better prototypes are generated. These new features can eliminate unnecessary information and increase the separation between fraud nodes and benign nodes.
- **Utilize global confidence degree**: Calculate the GCD of each node to extract information from a global perspective, providing a new perspective for observing fraud patterns, ensuring model performance and significantly improving the convergence speed.
- **Aggregation of typical and atypical information**: Aggregate messages from both typical and atypical perspectives respectively, enrich the message sources, remove interference information, and directly improve the model performance.

The paper has carried out extensive experiments on two public datasets. The results show that GCD-GNN outperforms the existing state-of-the-art methods in multiple indicators. In addition, the paper also provides a lightweight version of GCD-GNN, which has a faster training and inference speed while maintaining good performance.
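
The idea of scoring each node against a global prototype and then aggregating neighbors from a typical and an atypical perspective can be illustrated with a small sketch. This is not the paper's formula, which the summary does not give. The sketch assumes that the features have already been transformed (the MLP step is omitted), that the prototype is the mean feature vector, that GCD is the cosine similarity to that prototype rescaled to [0, 1], and that the "inverse" GCD is `1 - GCD`. The function names and the toy graph are hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def softmax(scores):
    """Numerically stable softmax over a non-empty list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_confidence(features):
    """ASSUMPTION: GCD = cosine similarity to the mean prototype, mapped to [0, 1]."""
    prototype = mean_vector(features)
    return [(cosine(x, prototype) + 1.0) / 2.0 for x in features]


def weighted_sum(weights, vectors):
    """Weighted sum of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def aggregate(node, neighbors, features, gcd):
    """Concatenate a typical-view and an atypical-view neighbor message.

    ASSUMPTION: attention is a softmax over GCD (typical view) and over
    1 - GCD (atypical view, standing in for the "inverse" GCD).
    """
    nbrs = neighbors[node]
    dim = len(features[0])
    if not nbrs:
        return [0.0] * (2 * dim)  # isolated node: no message to aggregate
    nbr_feats = [features[j] for j in nbrs]
    typical = weighted_sum(softmax([gcd[j] for j in nbrs]), nbr_feats)
    atypical = weighted_sum(softmax([1.0 - gcd[j] for j in nbrs]), nbr_feats)
    return typical + atypical


# Toy graph: nodes 0-2 look alike, node 3 points in a different direction.
features = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [-1.0, 0.5]]
neighbors = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

gcd = global_confidence(features)
message = aggregate(0, neighbors, features, gcd)
print("GCD per node:", [round(g, 3) for g in gcd])
print("message for node 0:", [round(m, 3) for m in message])
```

In this toy graph the outlier node 3 receives the lowest GCD, so it contributes little to the typical half of node 0's message and dominates the atypical half. A real model would learn the feature transformation and feed both halves into later layers.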