Threshold-Based Retrieval and Textual Entailment Detection on Legal Bar Exam Questions

Sabine Wehnert, Sayed Anisul Hoque, Wolfram Fenske, Gunter Saake
DOI: https://doi.org/10.48550/arXiv.1905.13350
2019-05-31
Abstract: Getting an overview of the legal domain has become challenging, especially in a broad, international context. Legal question answering systems have the potential to ease this task by automatically retrieving relevant legal texts for a specific statement and checking whether the meaning of the statement can be inferred from the retrieved documents. We investigate a combination of the BM25 scoring method of Elasticsearch with word embeddings trained on English translations of the German and Japanese civil law. For this, we define criteria that select a dynamic number of relevant documents according to threshold scores. For the textual entailment task, exploiting two deep learning classifiers and their respective prediction bias with a threshold-based answer inclusion criterion has proven beneficial compared to the baseline.
Information Retrieval, Computation and Language, Machine Learning
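
The abstract mentions criteria that select a dynamic number of documents according to threshold scores, but does not spell them out. The following is a minimal sketch of one possible criterion, not the authors' method: it assumes each candidate article already has a non-negative relevance score (for example from BM25) and keeps every article whose score reaches a fixed fraction of the top score. The function name, the `ratio` parameter, and the article ids are hypothetical.

```python
from typing import Dict, List


def select_by_relative_threshold(scores: Dict[str, float], ratio: float = 0.8) -> List[str]:
    """Return ids of documents scoring at least `ratio` times the best score.

    Assumption: scores are non-negative, as BM25 scores are.
    """
    if not scores:
        return []
    cutoff = ratio * max(scores.values())
    # Rank from highest to lowest score, then keep everything above the cutoff.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [doc_id for doc_id, score in ranked if score >= cutoff]


# Hypothetical scores for three candidate articles.
example_scores = {"art_1": 12.0, "art_2": 10.5, "art_3": 4.0}
selected = select_by_relative_threshold(example_scores, ratio=0.8)
```

With these example scores the cutoff is 9.6, so two of the three articles are kept. The number of returned documents therefore varies per query instead of being fixed to a top-k value.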