A Semantic Similarity Analysis Method for English Texts Based on Prior Information

Nanxiao Deng, Ya Zhou, Yabing Wang, Guimin Huang, Yiqun Li, Jingru Chen
DOI: https://doi.org/10.21203/rs.3.rs-4183347/v1
2024-01-01
Abstract: With the rapid development of Internet technology, a massive amount of English text has emerged, creating an urgent need to evaluate the semantic similarity of these texts across different domains. Existing research on semantic similarity suffers from insufficient prior knowledge. To address this, the paper proposes the Knowledge Prior Topic Model (KPTM) for analyzing the semantic similarity of texts. The method incorporates semantic relationship data from Probase, a large-scale English knowledge base, into a topic model and combines it with a keyword serialization approach to compute a hybrid measure of text similarity over both concepts and keywords. Experimental results show that the proposed method improves semantic similarity measurement and outperforms other models in comparative experiments.
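The idea of a hybrid concept-and-keyword measure can be illustrated with a minimal Python sketch. This is not the authors' KPTM: it contains no topic model, the `TOY_CONCEPTS` table is a hypothetical stand-in for Probase lookups, the keyword-order score from `difflib.SequenceMatcher` is a substitute for the paper's keyword serialization approach, and the mixing weight `alpha` is an assumed parameter that the abstract does not specify.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt

# Hypothetical stand-in for knowledge-base lookups: term -> {concept: weight}.
# A real system would query Probase; these entries are invented for illustration.
TOY_CONCEPTS = {
    "apple": {"fruit": 0.6, "company": 0.4},
    "banana": {"fruit": 1.0},
    "microsoft": {"company": 1.0},
}


def concept_vector(keywords):
    """Sum the concept weights of every keyword found in the toy table."""
    vec = Counter()
    for word in keywords:
        for concept, weight in TOY_CONCEPTS.get(word, {}).items():
            vec[concept] += weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def keyword_sequence_similarity(a, b):
    """Order-aware overlap of two keyword lists (assumed stand-in measure)."""
    return SequenceMatcher(None, a, b).ratio()


def hybrid_similarity(a, b, alpha=0.5):
    """Weighted mix of concept-level and keyword-level similarity.

    alpha is an assumed weight: 1.0 uses concepts only, 0.0 keywords only.
    """
    concept_score = cosine(concept_vector(a), concept_vector(b))
    keyword_score = keyword_sequence_similarity(a, b)
    return alpha * concept_score + (1 - alpha) * keyword_score
```

With this toy table, `hybrid_similarity(["apple"], ["banana"])` is positive even though the two texts share no keywords, because both terms map to the concept `fruit`; that is the kind of gap prior knowledge is meant to close.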