Abstract: Sentence boundary detection (SBD) is an important first step in natural language processing, since the accuracy of sentence boundaries significantly affects downstream applications. Detecting sentence boundaries in legal texts is particularly challenging because of their distinct structural and linguistic features. Our approach uses deep learning models that take a delimiter and its surrounding context as input to detect sentence boundaries in English legal texts. We evaluate various deep learning models, including domain-specific transformer models such as LegalBERT and CaseLawBERT. To assess their efficacy, we compare them with a state-of-the-art domain-specific statistical conditional random field (CRF) model. After considering model size, F1-score, and inference time, we identify the convolutional neural network (CNN) model as the top-performing deep learning model. To further improve performance, we feed the features of the CNN model into a subsequent CRF model, creating a hybrid architecture that combines the strengths of both models. Our experiments show that the hybrid model outperforms the baseline model, achieving a 4% improvement in F1-score. Additional experiments show that the hybrid model also outperforms open-source SBD libraries on an out-of-domain test set. These findings underscore the importance of efficient SBD in legal texts and the advantages of deep learning models and hybrid architectures in achieving optimal performance.
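The idea of feeding a delimiter together with its surrounding context to a classifier can be illustrated with a minimal sketch. The delimiter set, the character-level window, the window size, and the name `candidate_windows` are all assumptions made for illustration; the abstract does not specify how the paper builds its inputs.

```python
# Hypothetical settings: the abstract does not state the delimiter set
# or the amount of context used, so these values are illustrative only.
DELIMITERS = ".?!;:"
WINDOW = 6  # characters of context on each side of the delimiter


def candidate_windows(text, window=WINDOW, delimiters=DELIMITERS):
    """Collect every candidate delimiter with its left and right context.

    Returns a list of (index, left_context, delimiter, right_context)
    tuples. A downstream model (for example a CNN, or a CNN feeding a
    CRF) would then label each candidate as a boundary or not.
    """
    candidates = []
    for i, ch in enumerate(text):
        if ch in delimiters:
            left = text[max(0, i - window):i]
            right = text[i + 1:i + 1 + window]
            candidates.append((i, left, ch, right))
    return candidates


# Legal citations contain many periods that are not sentence boundaries.
sample = "See 12 U.S.C. 1821. The court agreed."
for candidate in candidate_windows(sample):
    print(candidate)
```

Each tuple is one classification instance. The periods inside "U.S.C." are candidates just like the sentence-final ones, which is why the surrounding context matters for the decision.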

Ontology Enhanced Claim Detection

Model Semantic Relations with Extended Attributes

Claim Extraction in Biomedical Publications using Deep Discourse Model and Transfer Learning

Exploiting Ontological Reasoning In Argumentation Based Multi-Agent Collaborative Classification

Ontology-driven Event Type Classification in Images

Multilingual and Multi-topical Benchmark of Fine-tuned Language models and Large Language Models for Check-Worthy Claim Detection

Ontology Completion with Natural Language Inference and Concept Embeddings: An Analysis

OntoED: Low-resource Event Detection with Ontology Embedding

Sensorimotor Enhanced Neural Network for Metaphor Detection

Towards Ontology-Enhanced Representation Learning for Large Language Models

Robust Claim Verification Through Fact Detection

Enhancing Legal Argument Mining with Domain Pre-training and Neural Networks

Knowledge-Augmented Language Models for Cause-Effect Relation Classification

Learning And Applying Ontology For Machine Learning In Cyber Attack Detection

UoB at SemEval-2020 Task 12: Boosting BERT with Corpus Level Information

Enhancing Geometric Ontology Embeddings for $\mathcal{EL}^{++}$ with Negative Sampling and Deductive Closure Filtering

Ontology extension by online clustering with large language model agents

Ontological Relations from Word Embeddings

Legal sentence boundary detection using hybrid deep learning and statistical models

Inductive Relation Inference of Knowledge Graph Enhanced by Ontology Information

Enhancing Explainability in Multimodal Large Language Models Using Ontological Context