A Sememe Prediction Method Based on the Central Word of a Semantic Field

Guanran Luo, Yunpeng Cui
DOI: https://doi.org/10.3390/electronics13020413
IF: 2.9
2024-01-20
Electronics
Abstract: A "sememe" is an indivisible minimal unit of meaning in linguistics. Manually annotating the sememes of words requires a significant amount of time, so automated sememe prediction is often used to improve efficiency. Semantic fields serve as crucial mediators connecting the semantics between words. This paper proposes an unsupervised method for sememe prediction based on the common semantics between words and semantic fields. In comparison to methods based on word vectors, this approach demonstrates a superior ability to align the semantics of words and sememes. We construct various types of semantic fields through ChatGPT and design a semantic field selection strategy to adapt to different scenario requirements. Subsequently, following the order of word–sense–sememe, we decompose the process of calculating the semantic sememe similarity between semantic fields and target words. Finally, we select the word with the highest average semantic sememe similarity as the central word of the semantic field and use its sememes as the predicted result. On the BabelSememe dataset, constructed from the sememe knowledge base HowNet, the semantic field central word (SFCW) method achieved the best results on both unstructured and structured sememe prediction tasks, demonstrating the effectiveness of this approach. Additionally, we conducted qualitative and quantitative analyses of the sememe structure of the central word.
engineering, electrical & electronic; computer science, information systems; physics, applied
What problem does this paper attempt to address?
The paper addresses the problem of predicting sememes automatically, in order to reduce the cost of annotating them by hand. Specifically, it proposes an unsupervised method based on the semantic field central word (SFCW) to predict sememes for new words. Compared to methods based on word vectors, this approach aligns the semantics of words and sememes more closely. The main contributions of the paper are as follows:
1. It constructs three types of semantic fields and designs selection strategies for them, which widens the range of scenarios in which semantic fields can be applied and improves the accuracy of sememe prediction.
2. It achieves better alignment between words and sememes by predicting from the central word of the semantic field, without the need for vector training.
3. On the publicly available BabelSememe dataset, the proposed SFCW method achieves the best results on both unstructured and structured sememe prediction tasks. The paper also provides qualitative and quantitative analyses of the sememe structure of central words.
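
The central-word selection step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the semantic field is already given as a list of words, that the average is taken over the other annotated words in the field, and it substitutes a plain Jaccard overlap of sememe sets for the paper's word–sense–sememe similarity. The names `jaccard`, `central_word`, and `TOY_SEMEMES`, as well as the sememe labels, are hypothetical and not taken from HowNet.

```python
from typing import Dict, FrozenSet, List, Tuple


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Stand-in similarity (an assumption): overlap of two sememe sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def central_word(
    field: List[str], sememes: Dict[str, FrozenSet[str]]
) -> Tuple[str, float]:
    """Return the field word with the highest average similarity to the
    other annotated words in the field, together with that average."""
    # Only words that already have sememe annotations can be compared.
    candidates = [w for w in field if w in sememes]
    if not candidates:
        raise ValueError("no annotated words in the semantic field")

    best_word, best_score = candidates[0], -1.0
    for word in candidates:
        others = [o for o in candidates if o != word]
        if others:
            score = sum(jaccard(sememes[word], sememes[o]) for o in others) / len(others)
        else:
            score = 0.0
        # Strict comparison: on a tie, the earlier word in the field wins.
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


# Toy annotations, invented for illustration only.
TOY_SEMEMES: Dict[str, FrozenSet[str]] = {
    "apple": frozenset({"fruit", "edible"}),
    "pear": frozenset({"fruit", "edible", "juicy"}),
    "banana": frozenset({"fruit", "edible", "tropical"}),
    "hammer": frozenset({"tool"}),
}

# Hypothetical semantic field built for some unannotated target word.
field = ["apple", "pear", "banana", "hammer"]
word, score = central_word(field, TOY_SEMEMES)
predicted_sememes = TOY_SEMEMES[word]  # the central word's sememes are the prediction
print(word, round(score, 3), sorted(predicted_sememes))
```

In this toy field, the word whose sememes are shared most widely with the rest of the field is selected, and an outlier such as "hammer" contributes nothing to any average. Replacing `jaccard` with a sense-aware similarity would move the sketch closer to the procedure the abstract describes.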