Concept Graphs: A Novel Approach for Textual Analysis of Medical Documents

Franz Matthies, Christoph Beger, Ralph Schäfermeier, Alexandr Uciteli
DOI: https://doi.org/10.3233/SHTI230710
2023-09-12
Abstract: Automatically analyzing the textual content of documents is challenging in general, and even more so in the medical domain. Here, we normally cannot rely on NLP models pre-trained specifically for the domain, nor, for data privacy reasons, on the (massive) amounts of training material needed to generate such models. We therefore propose a method that uses general-purpose basic text analysis components and state-of-the-art transformer models to represent a corpus of documents as multiple graphs, in which important, conceptually related phrases from the documents constitute the nodes and their semantic relations form the edges. This method could serve as a basis for several explorative procedures and can draw on a plethora of publicly available resources. We test it by comparing the effectiveness of these so-called Concept Graphs with another recently suggested approach on a common use case in information retrieval: document clustering.
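To make the graph representation concrete, the following is a minimal sketch of the general idea, not the authors' implementation. It assumes that each extracted phrase already has an embedding vector (here hand-written toy vectors standing in for transformer output), that an edge is drawn when cosine similarity reaches a threshold, and that each connected component is treated as one graph. The function names, the threshold of 0.8, and the use of connected components are all illustrative assumptions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_concept_graph(embeddings, threshold=0.8):
    """Hypothetical helper: phrases become nodes; an undirected weighted edge
    links two phrases whose embeddings have cosine similarity >= threshold."""
    graph = {phrase: {} for phrase in embeddings}
    for a, b in combinations(embeddings, 2):
        sim = cosine(embeddings[a], embeddings[b])
        if sim >= threshold:
            graph[a][b] = sim
            graph[b][a] = sim
    return graph


def connected_components(graph):
    """Split the phrase graph into its connected components (assumed here to
    play the role of the separate graphs; this is an illustrative choice)."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(n for n in graph[node] if n not in seen)
        components.append(component)
    return components


# Toy vectors standing in for transformer embeddings (assumption, not real data).
toy_embeddings = {
    "myocardial infarction": [0.9, 0.1, 0.0],
    "heart attack": [0.85, 0.15, 0.05],
    "bone fracture": [0.1, 0.9, 0.1],
    "broken bone": [0.05, 0.95, 0.0],
}

graph = build_concept_graph(toy_embeddings)
components = connected_components(graph)
```

In practice the toy vectors would be replaced by embeddings from a publicly available transformer model, and the phrases by output of a phrase extraction step; both choices are left open here because the abstract does not specify them.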