Recidivism early warning model based on rough sets and the improved K-prototype clustering algorithm and a back propagation neural network

Kangshun Li, Ziming Wang, Xin Yao, Jiahao Liu, Hongming Fang, Yishu Lei
DOI: https://doi.org/10.1007/s12652-021-03337-z
IF: 3.662
2021-06-12
Journal of Ambient Intelligence and Humanized Computing
Abstract: The rate of recidivism among criminals after their release from prison is high, which is harmful to society, so reducing it is socially significant. This article uses public data from the state of Iowa in the United States. To handle the characteristics of the data, such as redundant samples and mixed attributes, we propose the following methods. First, we use a rough set attribute reduction algorithm based on probability distributions to reduce the redundant items. Second, the sample data are clustered with an improved clustering algorithm. Starting from the traditional K-prototype clustering algorithm, we change the measurement method for the categorical attributes, change the initial cluster center selection method, and weight the attributes based on information entropy. The clustering experiments show that the improved algorithm has a better clustering effect and higher clustering accuracy than the traditional K-prototype clustering algorithm. Third, a back propagation neural network is used to predict the recidivism probability of the samples processed by the above algorithms. The final experimental results show that two redundant attributes are successfully reduced by rough sets, which greatly reduces the run time of the model. Compared with the traditional K-prototype clustering algorithm, the improved K-prototype clustering algorithm proposed in this paper performs better on the various indicators and on the objective function. With neural network prediction, the model reached a prediction accuracy of 87.9%. In addition, a large number of experiments on benchmark datasets verify the effectiveness of the proposed model.
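The attribute-reduction step can be illustrated with a minimal sketch. The abstract does not give the formulas of the probability-distribution-based algorithm, so the code below swaps in the classical rough set dependency degree (the share of samples whose condition attributes determine the decision unambiguously) with greedy backward elimination. The column names (`age`, `offense`, `offense_code`, `recid`) are hypothetical and are not taken from the Iowa dataset.

```python
from collections import defaultdict

def dependency(rows, cond_attrs, decision):
    """Fraction of rows that fall in decision-consistent equivalence classes."""
    blocks = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in cond_attrs)   # equivalence class of the row
        blocks[key].append(row[decision])
    consistent = sum(len(d) for d in blocks.values() if len(set(d)) == 1)
    return consistent / len(rows)

def reduce_attributes(rows, cond_attrs, decision):
    """Drop each attribute whose removal leaves the dependency degree unchanged."""
    full = dependency(rows, cond_attrs, decision)
    kept = list(cond_attrs)
    for a in list(cond_attrs):
        trial = [x for x in kept if x != a]
        if trial and dependency(rows, trial, decision) == full:
            kept = trial                          # attribute a is redundant
    return kept

# Hypothetical toy data: offense_code duplicates the information in offense.
rows = [
    {"age": "young", "offense": "drug",  "offense_code": "D", "recid": "yes"},
    {"age": "young", "offense": "theft", "offense_code": "T", "recid": "no"},
    {"age": "old",   "offense": "drug",  "offense_code": "D", "recid": "no"},
    {"age": "old",   "offense": "theft", "offense_code": "T", "recid": "no"},
]
reduced = reduce_attributes(rows, ["age", "offense", "offense_code"], "recid")
```

Greedy elimination depends on the order in which attributes are tried, so it returns one reduct rather than all of them; which of two mutually redundant attributes survives is an artifact of that order.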
computer science, information systems, telecommunications, artificial intelligence
What problem does this paper attempt to address?
This paper addresses how to predict, and thereby help reduce, recidivism among criminals after their release from prison. The authors use public data from Iowa, USA, and propose a three-step method suited to the redundant samples and mixed attributes in the data:

1. **Attribute Reduction**: A rough set attribute reduction algorithm based on probability distributions reduces redundant items.
2. **Clustering Algorithm Improvement**: An improved K-prototype clustering algorithm clusters the sample data. It improves on the traditional K-prototype algorithm by changing the distance measurement method for categorical attributes, changing the selection method for initial cluster centers, and weighting attributes based on information entropy.
3. **Neural Network Prediction**: A back propagation (BP) neural network predicts the recidivism probability of the samples processed by the two steps above.

With these methods, the authors aim to improve the accuracy and efficiency of recidivism prediction, providing technical support to help reduce the recidivism rate and build a harmonious society. The reported experimental results show that the model reached a prediction accuracy of 87.9%.
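To make the clustering step more concrete, here is a minimal sketch of a K-prototype style dissimilarity between a sample and a cluster prototype: squared Euclidean distance on the numerical attributes plus a weighted mismatch count on the categorical ones. The summary does not state the paper's exact categorical measure or weighting formula, so the entropy normalization, the `gamma` trade-off parameter, and all function names here are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return sum((c / n) * math.log2(n / c) for c in Counter(values).values())

def entropy_weights(cat_columns):
    """Assumed scheme: weights proportional to each attribute's entropy, summing to 1."""
    h = [entropy(col) for col in cat_columns]
    total = sum(h)
    if total == 0:                      # every attribute is constant: fall back to equal weights
        return [1 / len(h)] * len(h)
    return [x / total for x in h]

def mixed_dissimilarity(x_num, x_cat, c_num, c_cat, weights, gamma=1.0):
    """Squared Euclidean part plus gamma times the weighted categorical mismatches."""
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, c_num))
    categorical = sum(w * (a != b) for w, a, b in zip(weights, x_cat, c_cat))
    return numeric + gamma * categorical

# Hypothetical categorical columns: the second is constant and gets zero weight.
weights = entropy_weights([["a", "a", "b", "b"], ["x", "x", "x", "x"]])
d = mixed_dissimilarity([1.0, 2.0], ["a", "x"], [1.0, 4.0], ["b", "y"], weights, gamma=0.5)
```

Under this assumed scheme, an attribute that never varies carries no information and contributes nothing to the distance. Whether high-entropy attributes should receive larger or smaller weights is a design choice that the summary leaves open.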