Abstract: The application of artificial intelligence in the legal domain has received significant attention from legal professionals and AI researchers in recent years. Intelligent judgment systems have made remarkable progress thanks to advances in natural language processing, particularly deep learning. Similar case matching has considerable potential in this domain: retrieving and analyzing similar cases helps legal professionals reach more reasonable judgments and promotes fairness, consistency, and accuracy in the application of law. Existing methods do not make full use of both representation-based and interaction-based text matching during feature extraction. This paper presents an approach that uses ensemble learning over multiple models to improve the prediction of legal case similarity. The method comprises two sub-networks: a similarity representation sub-network and a binary classification judgment sub-network. The similarity representation sub-network is trained with contrastive learning, which learns semantic features that reduce the distance between similar samples and push dissimilar samples apart. The binary classification judgment sub-network takes sample pairs as joint input, so that the features of the two texts interact during extraction. Because the two sub-networks are trained with different forms of information processing and different loss functions, ensemble learning can capitalize on the strengths of both models and significantly improve the accuracy of legal case similarity prediction. On the public dataset CAIL2019-SCM, our method achieves an accuracy of 74.53% on the test set, outperforming other existing methods.
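As an illustration only, the interplay between the contrastive objective and the ensemble step can be sketched as follows. The abstract does not specify the loss form, the distance measure, or the fusion rule, so the cosine distance, the margin-based triplet loss, the temperature, and the weighted average used here are all assumptions, and every function name is hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_margin_loss(anchor, positive, negative, margin=0.5):
    """Assumed contrastive objective: the loss is zero once the similar
    case is closer to the anchor than the dissimilar case by `margin`."""
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def representation_score(anchor, cand_b, cand_c, temperature=0.1):
    """Probability that cand_b is more similar to the anchor than cand_c,
    computed as a two-way softmax over scaled cosine similarities."""
    s_b = cosine_similarity(anchor, cand_b) / temperature
    s_c = cosine_similarity(anchor, cand_c) / temperature
    return 1.0 / (1.0 + math.exp(s_c - s_b))


def ensemble_probability(p_representation, p_classifier, weight=0.5):
    """Assumed fusion rule: weighted average of the two sub-network outputs."""
    return weight * p_representation + (1.0 - weight) * p_classifier


if __name__ == "__main__":
    # Toy 2-d features standing in for encoded case documents.
    anchor, case_b, case_c = [1.0, 0.0], [0.9, 0.1], [0.1, 0.9]
    p_rep = representation_score(anchor, case_b, case_c)
    p_cls = 0.6  # placeholder output of the interaction-based classifier
    print(triplet_margin_loss(anchor, case_b, case_c))
    print(ensemble_probability(p_rep, p_cls))
```

In this sketch the representation-based score depends only on independently encoded vectors, while the classifier probability is a placeholder for a model that reads both texts jointly; averaging the two is one simple way to combine complementary signals.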

Legal Document Similarity Matching Based on Ensemble Learning
