Multiway Dynamic Mask Attention Networks for Natural Language Inference.

Jingfan Tang, Xinqiang Wu, Min Zhang, Xiujie Zhang, Ming Jiang
DOI: https://doi.org/10.3233/jcm-204451
2020-01-01
Journal of Computational Methods in Sciences and Engineering
Abstract: Attention mechanisms are widely used in NLP tasks and perform strongly at modeling local and global dependencies. The directional self-attention network (DSAN) achieves competitive performance on various datasets, but it does not consider the reverse information of a sentence. In this paper, we propose the Multiway Dynamic Mask Attention Network (MDMAN). The model has two modules: a dynamic mask selector and a multi-attention encoder. The dynamic mask selector uses reinforcement learning to choose high-quality reverse information and feeds it to the multi-attention encoder. The multi-attention encoder uses four attention functions to match words within the same sentence at different token levels, then combines the information from all functions to obtain the final representation. Experiments on two publicly available NLI datasets show that MDMAN achieves significant improvement over DSAN.
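To make the idea of masked, direction-aware self-attention concrete, the following is a minimal sketch in plain Python. It is not the MDMAN implementation: the abstract does not specify the four attention functions or the reinforcement-learning mask selector, so this example assumes two fixed masks (forward and backward, each including the diagonal) and combines their outputs by concatenation. All function names here are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax; entries equal to -inf receive weight 0.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def masked_self_attention(x, mask):
    """Scaled dot-product self-attention over x, restricted by a boolean mask.

    x: list of n token vectors (each a list of d floats).
    mask: n x n list of bools; mask[i][j] is True if token i may attend to token j.
    Each row of the mask must allow at least one position.
    """
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        scores = [
            sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
            if mask[i][j] else float("-inf")
            for j in range(n)
        ]
        w = softmax(scores)
        out.append([sum(w[j] * x[j][k] for j in range(n)) for k in range(d)])
    return out

def directional_masks(n):
    # Assumption: fixed masks stand in for the paper's learned dynamic masks.
    forward = [[j <= i for j in range(n)] for i in range(n)]   # attend to self and earlier tokens
    backward = [[j >= i for j in range(n)] for i in range(n)]  # attend to self and later tokens
    return forward, backward

def encode(x):
    # Run one attention pass per direction and concatenate the per-token outputs.
    fw, bw = directional_masks(len(x))
    h_fw = masked_self_attention(x, fw)
    h_bw = masked_self_attention(x, bw)
    return [a + b for a, b in zip(h_fw, h_bw)]

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
encoded = encode(tokens)
```

Each token in `encoded` has twice the input dimension: the first half summarizes the token and everything before it, the second half the token and everything after it. A dynamic mask selector, as described in the abstract, would replace the fixed `backward` mask with one chosen per sentence.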