Identification of Topics from Scientific Papers through Topic Modeling

Denis Luiz Marcello Owa
DOI: https://doi.org/10.4236/OJAPPS.2021.104038
Open Journal of Applied Sciences
Abstract: Topic modeling is a probabilistic approach that identifies the topics covered in one or more texts. In this paper, topics were obtained from two implementations of topic modeling, namely Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA). The analysis was performed on a corpus of 1000 academic papers written in English, obtained from the PLOS ONE website, in the areas of Biology, Medicine, Physics and Social Sciences. The objective was to verify whether the four academic fields were represented in the four topics obtained by topic modeling. The four topics obtained from LSI and from LDA did not represent the four academic fields.
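As a rough illustration of the kind of experiment the abstract describes, the sketch below fits a four-topic LDA model to a tiny toy corpus and lists the top words per topic, so that one can inspect whether the topics line up with the four fields. It is not the paper's implementation: the collapsed Gibbs sampler, the function names fit_lda and top_words, the hyperparameters alpha and beta, and the eight toy documents are all assumptions made for this example, and LSI is not covered.

```python
import random


def fit_lda(docs, num_topics=4, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative, hypothetical helper).

    docs is a list of token lists. Returns (vocab, topic_word, doc_topic),
    where topic_word[k][v] and doc_topic[d][k] are raw assignment counts.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return vocab, topic_word, doc_topic


def top_words(vocab, topic_word, n=3):
    """Return the n highest-count words for each topic."""
    result = []
    for row in topic_word:
        order = sorted(range(len(vocab)), key=lambda v: -row[v])
        result.append([vocab[v] for v in order[:n]])
    return result


# Toy corpus (invented for this sketch): two short "papers" per field.
docs = [
    "cell gene protein cell gene".split(),
    "gene protein species cell".split(),
    "patient treatment clinical patient".split(),
    "clinical treatment disease patient".split(),
    "quantum energy particle quantum".split(),
    "particle energy field quantum".split(),
    "social survey behavior social".split(),
    "behavior survey network social".split(),
]

vocab, topic_word, doc_topic = fit_lda(docs, num_topics=4)
topics = top_words(vocab, topic_word)
for k, words in enumerate(topics):
    print(f"topic {k}: {', '.join(words)}")
```

Whether the four printed topics match the four fields has to be judged by reading the top words, which mirrors the manual comparison the abstract reports; on a real corpus the result depends heavily on preprocessing and hyperparameters.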
Computer Science, Physics