Data Mining in Customs Risk Detection With Cost-Sensitive Classification

Xin Zhou
DOI: https://doi.org/10.55596/001c.116219
2019-09-30
Abstract: To improve the efficiency and accuracy of risk management in Customs, this paper explores the data mining process for risk detection with decision tree and boosting algorithms. The data are characterised by high dimensionality, class imbalance and cost sensitivity. In particular, misjudging a false declaration as truthful can be more harmful than misjudging a truthful declaration as false. Therefore, to account for the different costs of misclassification, we suggest a cost-sensitive approach that uses a cost matrix in data mining. The inspection results are set as the prediction target variable to train the classifiers and make predictions. A binary classification model is formulated after feature selection and rebalancing. We evaluate its performance with classic classification measures and customs risk assessment measures. The results show that boosting significantly improves performance, while the output becomes less sensitive to the cost ratio under boosting.
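To illustrate how a cost matrix can shift a binary decision, the following minimal sketch picks the class with the lowest expected misclassification cost. It is not the paper's implementation: the cost values, the class encoding (0 = truthful, 1 = false declaration) and the function names are assumptions made for illustration only.

```python
# Hypothetical cost matrix: rows = true class, columns = predicted class.
# Class 0 = truthful declaration, class 1 = false declaration.
# The values are illustrative assumptions, not figures from the paper.
COST = [
    [0.0, 1.0],  # truthful: release costs 0, needless inspection costs 1
    [5.0, 0.0],  # false: missed risk costs 5, correct detection costs 0
]


def expected_costs(p_false, cost=COST):
    """Return [cost of predicting 0, cost of predicting 1] given P(false)."""
    p_true = 1.0 - p_false
    return [p_true * cost[0][j] + p_false * cost[1][j] for j in (0, 1)]


def cost_sensitive_predict(p_false, cost=COST):
    """Predict the class whose expected misclassification cost is lowest."""
    costs = expected_costs(p_false, cost)
    return 1 if costs[1] < costs[0] else 0


def decision_threshold(cost=COST):
    """Probability of 'false' above which predicting class 1 is cheaper."""
    fp = cost[0][1] - cost[0][0]  # extra cost of a false positive
    fn = cost[1][0] - cost[1][1]  # extra cost of a false negative
    return fp / (fp + fn)


if __name__ == "__main__":
    for p in (0.1, 0.2, 0.5):
        print(p, expected_costs(p), cost_sensitive_predict(p))
    print("threshold:", decision_threshold())
```

With these assumed costs, a declaration is flagged as soon as its estimated probability of being false exceeds 1/6, rather than the usual 1/2, which reflects the abstract's point that a missed false declaration is the more harmful error.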