Crime risk analysis through big data algorithm with urban metrics
Jia Wang, Jun Hu, Shifei Shen, Jun Zhuang, Shunjiang Ni
DOI: https://doi.org/10.1016/j.physa.2019.123627
2020-01-01
Abstract: Crime is pervasive around the world, and understanding how a city's social features influence crime occurrence is an active research topic. Correlations between crime and other social characteristics have been studied with many statistical models, including Ordinary Least Squares (OLS) linear regression, Random Forest (RF) regression, and Artificial Neural Network (ANN) models. However, the results of these studies, such as their prediction accuracy, are unsatisfactory, and previous work has reached many contradictory conclusions. These controversies stem from several factors, including the non-Gaussian distributions and multicollinearity of urban social data and the inaccuracy and inadequacy of the processed data. To fill these gaps, we analyzed the influence of 18 urban indicators in 6 categories (geography, economy, education, housing, urbanization, and population structure) on crime risk in China's major prefecture-level cities by year. We used two big data algorithms, the Least Absolute Shrinkage and Selection Operator (LASSO) and Extremely Randomized Trees (Extra-Trees), to predict crime risk and quantify the influence of urban parameters on crime. Our fitted model achieves 83% accuracy in crime risk prediction, and the importance of the urban indicators is ranked. The results show that the area of land used for living, the number of mobile telephone subscribers, and the employed population are the three main factors in crime occurrence in China. Our research contributes to a better understanding of the effects of urban indicators on crime in a socialist nation and provides governments with guidance and strategies for crime prediction and crime rate control in the big-data era. (C) 2019 Elsevier B.V. All rights reserved.
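
The LASSO step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the indicator names (`indicator_0` to `indicator_4`) and the penalty value are hypothetical, and the Extra-Trees step is not shown. The sketch fits a LASSO model by cyclic coordinate descent and ranks indicators by absolute coefficient, which shows how the L1 penalty sets the coefficients of uninformative indicators exactly to zero.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def standardize(columns):
    """Scale each feature column to zero mean and unit variance."""
    scaled = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        std = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        scaled.append([(v - mean) / std for v in col])
    return scaled


def lasso_coordinate_descent(columns, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent.

    columns: list of feature columns (ideally standardized); y: centered target.
    """
    n, p = len(y), len(columns)
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = columns[j]
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(col[i] * (residual[i] + col[i] * w[j]) for i in range(n)) / n
            z = sum(v * v for v in col) / n or 1.0
            new_w = soft_threshold(rho, lam) / z
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                w[j] = new_w
    return w


# Synthetic data (assumption): 200 hypothetical cities, 5 hypothetical indicators.
# Only indicator_0 and indicator_2 actually drive the synthetic "crime risk".
rng = random.Random(0)
n_cities = 200
names = [f"indicator_{k}" for k in range(5)]
raw = [[rng.gauss(0.0, 1.0) for _ in range(n_cities)] for _ in names]
risk = [3.0 * raw[0][i] - 2.0 * raw[2][i] + rng.gauss(0.0, 0.5)
        for i in range(n_cities)]

X = standardize(raw)
mean_risk = sum(risk) / n_cities
y = [r - mean_risk for r in risk]

weights = lasso_coordinate_descent(X, y, lam=0.3)

# Rank indicators by the magnitude of their LASSO coefficient.
ranking = sorted(zip(names, weights), key=lambda item: abs(item[1]), reverse=True)
for name, coef in ranking:
    print(f"{name}: {coef:+.3f}")
```

In practice, a library implementation such as scikit-learn's `Lasso` and `ExtraTreesRegressor` would likely replace the hand-written solver. The abstract does not state which tooling the authors used.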