An Empirical Study of Linear Dimensionality Reduction for Judicial Predictive Models

Zhenyu Liu, Huanhuan Chen
DOI: https://doi.org/10.1109/icist.2018.8426121
2018-01-01
Abstract: Judicial cases can be modeled as textual frequency vectors under the Bag-of-Words assumption to predict the decision outcome. However, such models often have far more features than training samples, which usually leads to overfitting. In this paper, we conduct an empirical investigation of linear dimensionality reduction for high-dimensional judicial predictive models using the widely used principal component analysis (PCA) approach. The experimental results show that these high-dimensional models suffer not from overfitting but from underfitting. Moreover, the higher-order dependency in the textual frequency data cannot be decorrelated by linear dimensionality reduction, which limits the performance of judicial classification models because the signal-to-noise ratio in the derived low-dimensional features remains unchanged.
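The setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: the matrix `X` is synthetic Poisson counts standing in for Bag-of-Words frequencies, and the helper name `pca_reduce` and the sizes `n_cases`, `vocab_size`, and `k` are assumptions chosen for illustration. The sketch applies PCA through a thin SVD in the regime where features far outnumber samples.

```python
import numpy as np


def pca_reduce(X, k):
    """Project the rows of X onto the top-k principal components.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Center each feature (column) so the SVD captures variance, not the mean.
    mean = X.mean(axis=0)
    Xc = X - mean
    # Thin SVD: with n samples and p >> n features, at most n - 1
    # components carry non-zero variance after centering.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                      # shape (k, p)
    Z = Xc @ components.T                    # low-dimensional features, shape (n, k)
    explained = (S[:k] ** 2) / np.sum(S ** 2)  # fraction of variance per component
    return Z, components, explained


# Synthetic stand-in for term-frequency data (assumed sizes, not from the paper).
rng = np.random.default_rng(0)
n_cases, vocab_size, k = 40, 500, 10
X = rng.poisson(lam=1.0, size=(n_cases, vocab_size)).astype(float)

Z, components, explained = pca_reduce(X, k)

# The projected features are uncorrelated: their covariance matrix is diagonal.
cov_Z = np.cov(Z, rowvar=False)
off_diagonal = cov_Z - np.diag(np.diag(cov_Z))
print(Z.shape, float(np.abs(off_diagonal).max()))
```

The diagonal covariance of `Z` shows what PCA does remove, namely second-order (linear) correlation. Any higher-order dependency among the features is untouched by this projection, which is the limitation the abstract points to.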