Dependency-based syntax-aware word representations
Meishan Zhang,Zhenghua Li,Guohong Fu,Min Zhang
DOI: https://doi.org/10.1016/j.artint.2020.103427
IF: 14.4
2021-01-01
Artificial Intelligence
Abstract: Dependency syntax has been shown to be highly useful for a number of natural language processing (NLP) tasks. Typical approaches to utilizing dependency syntax include Tree-RNN and Tree-Linearization, both of which take explicit 1-best tree outputs from a well-trained parser as inputs. However, these approaches may suffer from error propagation due to the inevitable errors contained in the 1-best tree outputs. In this work, we propose a novel approach to integrating dependency syntax without using the discrete tree outputs. The key idea is to use the intermediate hidden representations of a well-trained encoder-decoder dependency parser, which are referred to as Dependency-based Syntax-Aware Word Representations (Dep-SAWRs). We then simply concatenate these Dep-SAWRs with the conventional context-insensitive word embeddings to compose the input word representations, without requiring any modification to the model architecture of the downstream tasks. We evaluate the proposed method on four kinds of typical NLP tasks: sentence classification, sentence matching, sequence labeling and machine translation. Experimental results show that the proposed approach is highly promising. On the one hand, it utilizes dependency syntax effectively, bringing consistently better performance on the four tasks compared with baselines that do not use syntax. On the other hand, the proposed method outperforms the Tree-RNN and Tree-Linearization approaches in most settings, while being highly efficient in syntax integration. In addition, the proposed method could easily be extended to encode other structural attributes of language. (C) 2020 Elsevier B.V. All rights reserved.
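
The concatenation step described in the abstract can be illustrated with a minimal sketch in plain Python. Everything named here is an assumption made for illustration and is not taken from the paper: `toy_embedding` stands in for a conventional embedding table, `toy_parser_encoder` stands in for the frozen encoder of a pretrained dependency parser (a real one would be a trained neural encoder, and which hidden layer to expose is a design choice), and the dimensions `EMB_DIM` and `HID_DIM` are toy values.

```python
import random
from typing import List

Vector = List[float]

EMB_DIM = 4  # toy size of the context-insensitive word embeddings (assumption)
HID_DIM = 6  # toy size of the parser encoder hidden states (assumption)


def toy_embedding(word: str, dim: int = EMB_DIM, salt: str = "emb") -> Vector:
    """Hypothetical context-insensitive embedding: depends on the word only."""
    rng = random.Random(f"{salt}:{word}")  # string seed, so lookups are deterministic
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def toy_parser_encoder(words: List[str], dim: int = HID_DIM) -> List[Vector]:
    """Hypothetical stand-in for the frozen encoder of a pretrained parser.

    Each hidden state mixes a token with its left and right neighbours, so the
    output is context-sensitive. No tree is ever decoded or used.
    """
    padded = ["<s>"] + words + ["</s>"]
    states = []
    for i in range(1, len(padded) - 1):
        left = toy_embedding(padded[i - 1], dim, salt="enc")
        center = toy_embedding(padded[i], dim, salt="enc")
        right = toy_embedding(padded[i + 1], dim, salt="enc")
        states.append(
            [0.25 * l + 0.5 * c + 0.25 * r for l, c, r in zip(left, center, right)]
        )
    return states


def compose_inputs(words: List[str]) -> List[Vector]:
    """Concatenate each word embedding with the parser's hidden state for that token."""
    embeddings = [toy_embedding(w) for w in words]
    hidden = toy_parser_encoder(words)
    return [e + h for e, h in zip(embeddings, hidden)]


if __name__ == "__main__":
    inputs = compose_inputs(["the", "river", "bank"])
    print(len(inputs), len(inputs[0]))
```

Because the syntax-aware part is simply appended to each token's vector, a downstream model only needs a wider input layer (`EMB_DIM + HID_DIM` in this sketch); the rest of its architecture is unchanged, which matches the integration style the abstract describes.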