Latent Dirichlet Allocation - An approach for topic discovery

Astha Goyal, Indu Kashyap
DOI: https://doi.org/10.1109/com-it-con54601.2022.9850912
2022-05-26
Abstract: The digital age has brought a surge in data generation and, with it, challenges in processing that data. As research has progressed, machine learning and NLP algorithms have enabled smoother data processing. Machine learning algorithms can be classified as supervised, unsupervised, or reinforcement learning. Topic modeling is an unsupervised machine learning technique. A typical application is mining a body of text to discover its hidden semantic structure, that is, its topics. Numerous techniques fall under topic modeling, including Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Analysis (PLSA), Latent Dirichlet Allocation (LDA), and Non-Negative Matrix Factorization (NMF). LDA is considered the most prevalent topic modeling method and has evolved substantially since its introduction. It has given rise to many variants, such as the Hierarchical LDA Model (hLDA), Dynamic Topic Model (DTM), Correlated Topic Model (CTM), Pachinko Allocation Topic Model (PAM), and Author Topic Model. This paper assesses and analyzes LDA, its advancements, and its applications.
Computer Science