Word Embedding Based Document Similarity for the Inferring of Penalty.

Tieke He, Hao Lian, Zemin Qin, Zhipeng Zou, Bin Luo
DOI: https://doi.org/10.1007/978-3-030-02934-0_22
2018-01-01
Abstract: In this paper, we present a novel framework for inferring the fine amount in judicial cases, which uses word embeddings to calculate the distances between documents. Our work builds on recent studies of word embeddings that learn semantically meaningful representations for words from local occurrences in sentences. By adopting word2vec embeddings, the framework takes the contextual information of words into account, in contrast to traditional vector-based methods such as hierarchical clustering, kNN, k-means, and traditional collaborative filtering. In judicial research, deciding the amount of the fine or penalty in a legal case is an open problem. We treat it as a recommendation task: we divide all legal cases into 7 classes by the amount of the fine, and then, for a target legal case, we infer which class it belongs to. We conduct extensive experiments on a legal case dataset, and the results show that our proposed method outperforms all of the comparison methods in Precision, Recall, and F1-Score.
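The general idea of embedding-based document distance followed by class inference can be illustrated with a minimal sketch. The abstract does not specify the distance function or the inference rule, so the choices here are assumptions: each document is represented by the mean of its word vectors, documents are compared by cosine distance, and the fine class is chosen by a majority vote among the nearest labelled cases. The toy embeddings, the example cases, and the function names `document_vector`, `cosine_distance`, and `infer_fine_class` are all hypothetical; a real system would load trained word2vec vectors instead.

```python
import math
from collections import Counter

# Hypothetical 3-dimensional word vectors, for illustration only.
# A real system would load vectors trained with word2vec.
TOY_EMBEDDINGS = {
    "theft": [0.9, 0.1, 0.0],
    "stolen": [0.8, 0.2, 0.0],
    "fraud": [0.1, 0.9, 0.0],
    "forged": [0.2, 0.8, 0.1],
    "speeding": [0.0, 0.1, 0.9],
    "vehicle": [0.1, 0.0, 0.8],
}


def document_vector(tokens, embeddings):
    """Average the vectors of all tokens that have an embedding."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("document contains no known words")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def infer_fine_class(target_tokens, labelled_cases, embeddings, k=3):
    """Majority vote over the k labelled cases closest to the target."""
    target = document_vector(target_tokens, embeddings)
    scored = sorted(
        (cosine_distance(target, document_vector(tokens, embeddings)), label)
        for tokens, label in labelled_cases
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled cases: (tokens, fine class in 0..6).
CASES = [
    (["theft", "stolen"], 2),
    (["stolen", "vehicle"], 2),
    (["fraud", "forged"], 5),
    (["forged", "fraud", "fraud"], 5),
    (["speeding", "vehicle"], 0),
]

predicted = infer_fine_class(["theft"], CASES, TOY_EMBEDDINGS, k=3)
```

Averaging word vectors is only one simple way to turn word embeddings into a document distance; other embedding-based distances could be substituted in `infer_fine_class` without changing the voting step.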