Abstract:<p>The continuous growth of scientific literature brings innovations and, at the same time, raises new challenges. One of them is that analysing this literature has become difficult: the volume of published papers is high, and annotating and managing them requires manual effort. Novel technological infrastructures are needed to help researchers, research policy makers, and companies browse, analyse, and forecast scientific research time-efficiently. Knowledge graphs, i.e., large networks of entities and relationships, have proved to be an effective solution in this space. Scientific knowledge graphs focus on the scholarly domain and typically contain metadata describing research publications, such as authors, venues, organizations, research topics, and citations. However, the current generation of knowledge graphs lacks an explicit representation of the knowledge presented in the research papers. In this paper, we therefore present a new architecture that uses Natural Language Processing and Machine Learning methods to extract entities and relationships from research publications and integrates them into a large-scale knowledge graph. In this work, we (i) tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools, (ii) describe an approach for integrating the entities and relationships generated by these tools, (iii) show the advantage of such a hybrid system over alternative approaches, and (iv) as a chosen use case, generate a scientific knowledge graph of 109,105 triples extracted from 26,827 abstracts of papers within the <em>Semantic Web</em> domain. As our approach is general and can be applied to any domain, we expect that it can facilitate the management, analysis, dissemination, and processing of scientific knowledge.</p>

Learning from syntax generalizations for automatic semantic annotation

An Ontology-Based Automatic Semantic Annotation Approach for Patent Document Retrieval in Product Innovation Design

Learning Latent Semantic Annotations for Grounding Natural Language to Structured Data

Towards Compositionally Generalizable Semantic Parsing in Large Language Models: A Survey

Automatic Semantic Annotation Using Machine Learning

A fully automated approach to a complete Semantic Table Interpretation

Understanding the Logical and Semantic Structure of Large Documents

Modeling Legal Reasoning: LM Annotation at the Edge of Human Agreement

Sentence Embeddings and High-speed Similarity Search for Fast Computer Assisted Annotation of Legal Documents

SALKG: A Semantic Annotation System for Building a High-quality Legal Knowledge Graph

Semantic Segmentation of Legal Documents via Rhetorical Roles

SeMi: A SEmantic Modeling machIne to build Knowledge Graphs with graph neural networks

A case study for automated attribute extraction from legal documents using large language models

Data Extraction via Semantic Regular Expression Synthesis

Search-Based Automatic Web Image Annotation Using Latent Visual and Semantic Analysis

Generating knowledge graphs by employing Natural Language Processing and Machine Learning techniques within the scholarly domain

Semantic Parsing for English as a Second Language

LLM Based Multi-Agent Generation of Semi-structured Documents from Semantic Templates in the Public Administration Domain

Unsupervised Law Article Mining based on Deep Pre-Trained Language Representation Models with Application to the Italian Civil Code

Legal information retrieval for understanding statutory terms

A Semi-automated Ontology Construction for Legal Question Answering