An enrichment multi-layer Arabic text classification model based on siblings patterns extraction

Amira M. Idrees, Abdul Lateef Marzouq Al-Solami
DOI: https://doi.org/10.1007/s00521-023-09405-z
2024-03-15
Neural Computing and Applications
Abstract: Ontology extraction is the cornerstone of meaningful knowledge representation. Ontologies serve as a repository of semantic relations in a readable format that clearly represents domain knowledge. This makes automated ontology construction a promising research objective with a direct impact on many related fields, including knowledge base systems and text classification. In this research, a workflow is set up for ontology learning from Arabic textual data. One of the bottlenecks in the text analytics field is the continuous need for up-to-date resources such as lexicons. This challenge is a main focus of the current research, which proposes an automated ontology extraction method that uses no pre-defined resources. The research proposes a novel generic ontology learning and document classification model that relies on no prior text analysis resources. Moreover, a self-enrichment approach is proposed to ensure continuous knowledge construction. The research extends the ontology learning process to include the ontologies’ semantic relationships, targeting a higher level of extraction and model enrichment. Two experiments were conducted on two datasets from different fields to test the generality of the proposed model. The results of both experiments confirmed the high accuracy of the proposed model and its positive contribution to the classification task. The results of the ontology learning task reached 95%, while in the classification task the Bagging algorithm outperformed the other machine learning algorithms, with an accuracy of 97.92%.
computer science, artificial intelligence