Automatic Synonym Extraction Using Word2vec And Spectral Clustering

Li Zhang, Jun Li, Chao Wang
DOI: https://doi.org/10.23919/ChiCC.2017.8028251
2017-01-01
Abstract: Synonym extraction is a fundamental research task that supports text mining and information retrieval. In this paper, we propose a method for extracting synonyms from text that combines word2vec with spectral clustering. First, a word2vec model is trained on a large-scale English Wikipedia corpus. Then, we extract keywords from a text and use the trained model to compute the similarities among these keywords. Since the word2vec model maps terms into a semantic vector space, the similarity between two terms is given by the cosine similarity of their vectors. We construct a graph over these terms together with its adjacency matrix. Finally, spectral clustering is used to group similar words. The experimental results show that this method achieves higher accuracy and recall than K-means.
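The pipeline described in the abstract (term vectors, cosine similarity, adjacency matrix, spectral clustering) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the small hand-written vectors stand in for word2vec embeddings trained on Wikipedia, the function names are hypothetical, and the normalized-Laplacian variant of spectral clustering with a plain k-means step is an assumption, since the abstract does not say which variant was used.

```python
import numpy as np


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return unit @ unit.T


def kmeans(points, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        dist = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def spectral_clustering(similarity, k):
    """Cluster graph nodes using the k smallest eigenvectors of the
    symmetric normalized Laplacian (an assumed variant)."""
    # Adjacency matrix: non-negative edge weights, no self-loops.
    W = np.clip(similarity, 0.0, None)
    np.fill_diagonal(W, 0.0)
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    # L = I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    embedding = eigvecs[:, :k]
    # Row-normalize the spectral embedding before running k-means.
    norms = np.linalg.norm(embedding, axis=1, keepdims=True)
    embedding = embedding / np.maximum(norms, 1e-12)
    return kmeans(embedding, k)


# Toy stand-ins for word2vec vectors (illustrative values, not real embeddings).
words = ["big", "large", "huge", "fast", "quick", "rapid"]
vectors = np.array([
    [1.0, 0.1, 0.0],
    [0.9, 0.1, 0.1],
    [1.0, 0.0, 0.1],
    [0.0, 1.0, 0.1],
    [0.0, 0.9, 0.1],
    [0.1, 1.0, 0.0],
])

labels = spectral_clustering(cosine_similarity_matrix(vectors), k=2)
for cluster in range(2):
    print(cluster, [w for w, l in zip(words, labels) if l == cluster])
```

In a realistic setting the toy array would be replaced by vectors looked up from a trained embedding model, and the number of clusters `k` would have to be chosen for the keyword set at hand; neither detail is specified in the abstract.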