An End-to-end Joint Model for Evidence Information Extraction from Court Record Document.

Donghong Ji,Peng Tao,Hao Fei,Yafeng Ren
DOI: https://doi.org/10.1016/j.ipm.2020.102305
IF: 7.466
2020-01-01
Information Processing & Management
Abstract:Information extraction is one of the important tasks in the field of Natural Language Processing (NLP). Most of the existing methods focus on general texts and little attention is paid to information extraction in specialized domains such as legal texts. This paper explores the task of information extraction in the legal field, which aims to extract evidence information from court record documents (CRDs). In the general domain, entities and relations are mostly words and phrases, indicating that they do not span multiple sentences. In contrast, evidence information in CRDs may span multiple sentences, while existing models cannot handle this situation. To address this issue, we first add a classification task in addition to the extraction task. We then formulate the two tasks as a multi-task learning problem and present a novel end-to-end model to jointly address the two tasks. The joint model adopts a shared encoder followed by separate decoders for the two tasks. The experimental results on the dataset show the effectiveness of the proposed model, which can obtain 72.36% F1 score, outperforming previous methods and strong baselines by a large margin.
What problem does this paper attempt to address?