Learning Sentences Similarity by Multi-Head Attention

Ming Yang Wang, Chen Jiang Li, Jian Dong Sun, Wei Ran Xu, Sheng Gao, Ya Hao Zhang, Pu Wang, Jun Liang Li
DOI: https://doi.org/10.1109/icnidc.2018.8525664
2018-01-01
Abstract: Text comprehension and information retrieval are two essential tasks that can be strengthened by modeling the semantic similarity of sentences and phrases. However, traditional methods that use an LSTM to process the input sentences share some general problems. Their semantic vectors cannot fully represent the entire input sequence, and the information contained in the earliest input is diluted or overwritten by later information. The longer the input sequence, the more serious this phenomenon becomes. To address these problems, we propose new methods based on self-attention. They can incorporate the weights of special words and highlight the similarity comparison between key words. Standard self-attention, by comparison, can only incorporate the weights of key words into the original sentences and describe position information in sentences through position encoding. Our experiments show that the new method can improve the performance of the model.
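The abstract does not give architectural details, so the following is only a minimal sketch of the general idea it names: self-attention with position encoding used to build sentence vectors that are then compared. It assumes generic multi-head scaled dot-product attention, sinusoidal position encoding, mean pooling, and cosine similarity. All function names, dimensions, and the random weights and embeddings are hypothetical placeholders, not the authors' model.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal position encoding (assumed variant; d_model must be even)."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])
    pe[:, 1::2] = np.cos(angles[:, 1::2])
    return pe

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Generic multi-head scaled dot-product self-attention over one sentence."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split(m):
        # (seq, d_model) -> (heads, seq, d_head)
        return m.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)   # per-head attention over words
    heads = weights @ v                  # (heads, seq, d_head)
    merged = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ w_o, weights

def encode_sentence(embeddings, params, num_heads):
    """Add position information, apply attention, mean-pool to one vector."""
    x = embeddings + positional_encoding(*embeddings.shape)
    out, _ = multi_head_self_attention(x, *params, num_heads=num_heads)
    return out.mean(axis=0)

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical setup: random weights and random word embeddings.
rng = np.random.default_rng(0)
d_model, num_heads = 16, 4
params = tuple(rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(4))
sent_a = rng.normal(size=(5, d_model))   # a 5-word sentence
sent_b = rng.normal(size=(7, d_model))   # a 7-word sentence

vec_a = encode_sentence(sent_a, params, num_heads)
vec_b = encode_sentence(sent_b, params, num_heads)
print(cosine_similarity(vec_a, vec_b))
```

Unlike an LSTM, every word here attends directly to every other word, so early words are not overwritten by later ones regardless of sentence length. In a trained model the weight matrices would be learned rather than drawn at random.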