Abstract: Semantic textual similarity (\(\mathcal {STS}\)) seeks to assess the degree of semantic equivalence between two sentences or text snippets. Most \(\mathcal {STS}\) methods operate on the surface form of words and treat words as semantically unrelated symbols, which leaves them unable to capture the ubiquitous conceptual associations among words. Recently, the concept transferred space (CTS) was proposed to address this word conceptual association problem. It is generated from the noun concepts in WordNet together with their IS-A relations. However, the CTS-based model can only handle nouns; as a result, a large number of words, namely verbs, adjectives, adverbs, and out-of-vocabulary named entities (OOV NEs), are neglected, which causes information loss in the semantic similarity evaluation. This paper presents ways to solve this problem. To involve words other than nouns, derivational links in WordNet are employed to associate verbs, adjectives, and adverbs with their corresponding noun concepts. To prevent the information loss caused by OOV NEs, the increase in information quantity that they contribute is predicted from the trend learned from known NEs. Moreover, to further improve the accuracy of the CTS-based model, we take the importance of different types of words into account by assigning corresponding weights to them. Experimental results suggest that the proposed comprehensive CTS-based model significantly improves on the original model, which lacks the non-nominal words, OOV NEs, and word weights, and that it also outperforms each year's state-of-the-art systems at the *SEM/SemEval 2013–2016 \(\mathcal {STS}\) tasks. Additionally, at the SemEval 2017 \(\mathcal {STS}\) task, our team, using the comprehensive CTS-based model, ranked second among all teams and first on the Track 1 dataset.
