Performance evaluation of Latent Dirichlet Allocation on legal documents

A. O. Ogundare, A. U. Saleh, O. A. James, E. E. Ajayi, S. Gostoji
DOI: https://doi.org/10.54254/2755-2721/52/20241322
2024-03-27
Abstract: Latent Dirichlet Allocation (LDA) is an algorithm capable of processing large amounts of text data. In this study, LDA is used to model topic clusters in a corpus of legal texts generated under four topics within the Nigerian context: Employment Contract, Election Petition, Deeds, and Articles of Incorporation. Each topic has a substantial number of articles, and the LDA method proves effective in extracting topics and generating the index words in each topic cluster. At the end of experimentation, the results are compared with a manually pre-annotated dataset for validation, and they show high accuracy. The LDA output shows its best performance in word indexing for Election Petition, where all the documents annotated under the topic were accurately classified.
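The validation step described in the abstract, comparing LDA topic clusters against manual annotations, could be sketched roughly as follows. This is a minimal illustration and not the authors' procedure: the majority-vote mapping from clusters to topic labels, the function names `map_clusters_to_labels` and `per_topic_accuracy`, and the toy cluster assignments are all assumptions made for the example. The toy data is invented and does not reproduce the paper's results.

```python
from collections import Counter, defaultdict


def map_clusters_to_labels(cluster_ids, gold_labels):
    """Give each cluster the most common manual label among its documents.

    Assumption: majority vote is used to name clusters; the paper does not
    state how clusters were matched to annotated topics.
    """
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, gold_labels):
        by_cluster[cluster][label] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in by_cluster.items()}


def per_topic_accuracy(cluster_ids, gold_labels):
    """Fraction of documents per annotated topic whose cluster maps back to it."""
    mapping = map_clusters_to_labels(cluster_ids, gold_labels)
    correct, total = Counter(), Counter()
    for cluster, label in zip(cluster_ids, gold_labels):
        total[label] += 1
        if mapping[cluster] == label:
            correct[label] += 1
    return {label: correct[label] / total[label] for label in total}


# Hypothetical toy data: ten documents with manual labels, plus the dominant
# LDA topic index assigned to each one (illustrative values only).
gold = (
    ["Election Petition"] * 3
    + ["Deeds"] * 3
    + ["Employment Contract"] * 2
    + ["Articles of Incorporation"] * 2
)
clusters = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]

mapping = map_clusters_to_labels(clusters, gold)
scores = per_topic_accuracy(clusters, gold)
for topic, accuracy in scores.items():
    print(f"{topic}: {accuracy:.2f}")
```

In a real pipeline, `clusters` would come from taking the highest-probability topic for each document in the fitted LDA model's document-topic distribution.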