Can Deep Learning Large Language Models be Used to Unravel Knowledge Graph Creation?

Sydney Anuyah, Sunandan Chakraborty
DOI: https://doi.org/10.1145/3661725.3661733
2024-04-12
Abstract: This research focuses on advancing relation extraction (RE) methodologies by employing and comparing various NLP models for analyzing medical relationships, particularly those concerning Gastroesophageal Reflux Disease (GERD). Leveraging a comprehensive dataset of GERD-related articles from PubMed, the study explores the effectiveness of spaCy for Named Entity Recognition (NER) and of BERT-based models (including Bio-BERT and ELECTRA) for tokenization and deep learning classification tasks. Unique to this study is the extensive comparison across multiple advanced models, which evaluates their precision, recall, F1-score, and accuracy on biomedical text. Bio-BERT emerged as the most effective model for this dataset, outperforming BERT-BASE and ELECTRA on all metrics. This performance underscores the value of Bio-BERT’s specialized pre-training on biomedical literature. The analysis includes the application of these models in constructing a comprehensive knowledge graph, which consolidates diverse information about GERD. Additionally, the paper presents a critical comparison between spaCy’s automated annotation and human annotators, using the F1-score to assess the reliability of BERT’s RE capabilities.
Computer Science, Medicine
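The abstract evaluates automated annotation against human annotators using precision, recall, and F1-score. As a rough illustration of how such a comparison can be scored, the sketch below treats the human annotations as the reference set and counts exact span-and-label matches. The function name `precision_recall_f1`, the `(start, end, label)` span representation, and the example annotations are all assumptions made for illustration; they are not taken from the paper, which does not state its matching criterion here.

```python
from typing import Set, Tuple

# Assumed representation: (start_char, end_char, label). Not from the paper.
Span = Tuple[int, int, str]


def precision_recall_f1(predicted: Set[Span], gold: Set[Span]) -> Tuple[float, float, float]:
    """Score predicted spans against gold spans using exact matching."""
    true_positives = len(predicted & gold)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical annotations for one passage (illustrative data only).
human = {(0, 4, "DISEASE"), (20, 30, "DRUG"), (45, 54, "SYMPTOM")}
automated = {(0, 4, "DISEASE"), (20, 30, "DRUG"), (60, 66, "DRUG")}

p, r, f = precision_recall_f1(automated, human)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Exact matching is the strictest choice; a partial-overlap criterion would give higher scores on the same annotations, so the matching rule should be reported alongside any F1-score.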
What problem does this paper attempt to address?