Dependency-based Convolutional Neural Networks for Sentence Embedding

Mingbo Ma,Liang Huang,Bing Xiang,Bowen Zhou
DOI: https://doi.org/10.48550/arXiv.1507.01839
2015-08-03
Abstract:In sentence modeling and classification, convolutional neural network approaches have recently achieved state-of-the-art results, but all such efforts process word vectors sequentially and neglect long-distance dependencies. To exploit both deep learning and linguistic structures, we propose a tree-based convolutional neural network model which exploit various long-distance relationships between words. Our model improves the sequential baselines on all three sentiment and question classification tasks, and achieves the highest published accuracy on TREC.
Computation and Language,Artificial Intelligence,Machine Learning
What problem does this paper attempt to address?
The problem that this paper attempts to solve is that in sentence modeling and classification, although the existing Convolutional Neural Network (CNN) methods have achieved state - of - the - art results, they only consider sequential information when processing word vectors and ignore long - distance dependencies. The author proposes a dependency - based convolutional method (Dependency - based Convolutional Neural Networks, DCNNs), which utilizes n - grams in the tree structure rather than surface n - grams, thereby exploiting non - local interactions between words, aiming to combine deep learning with linguistic structures and improve the model's ability to capture long - distance dependencies. Specifically, the paper mainly focuses on the following aspects: 1. **Capturing long - distance dependencies**: Traditional CNNs usually only consider continuous word sequences when processing text data, ignoring the possible long - distance dependencies between words. Such dependencies are crucial for understanding the true meaning of sentences, especially in tasks such as sentiment analysis, subjectivity analysis, and question classification. 2. **Combining linguistic structures**: By introducing the dependency tree structure, the model can better capture the structural information within the sentence, especially those long - distance dependencies that have an important impact on the sentence sentiment or category. 3. **Improving classification performance**: The author hopes that the proposed DCNN model can outperform the existing sequential - based CNN models in multiple classification tasks, especially in sentiment analysis and question classification tasks. The paper verifies the effectiveness of the DCNN model through experiments, especially its performance on the TREC dataset is better than all previously published results, including those methods using a large number of hand - designed features.