Hierarchical Parameter Sharing In Recursive Neural Networks With Long Short-Term Memory

Fengyu Li, Mingmin Chi, Dong Wu, Junyu Niu
DOI: https://doi.org/10.1007/978-3-319-70096-0_60
2017-01-01
Abstract: Parameter sharing (or weight sharing) is widely used in neural networks, such as Recursive Neural Networks (RvNNs) and their variants, to control model complexity and extract prior knowledge. Parameter sharing in RvNNs for language modeling assumes that all non-leaf nodes in a treebank are generated by similar semantic compositionality, so the hidden units of all non-leaf nodes share the same model parameters. However, treebanks have several semantic levels with significantly different semantic compositionality, and classification performance suffers when nodes at high semantic levels share parameters with those at low levels. This paper proposes a novel hierarchical parameter sharing strategy over Long Short-Term Memory (LSTM) cells in Recursive Neural Networks, denoted shLSTM-RvNN, in which the weight connections of hidden units are clustered according to the hierarchical semantic levels defined in the Penn Treebank tagsets. Parameters within the same semantic level are shared, while different semantic levels have different sets of connection weights. The proposed shLSTM-RvNN model is evaluated on benchmark data sets containing semantic compositionality. Empirical results show that the shLSTM-RvNN model increases classification accuracy while significantly reducing time complexity.
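The level-wise sharing idea described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions, not the authors' method: the grouping of Penn Treebank tags into "phrase" and "clause" levels in `LEVEL_OF_TAG`, the helper names `init_params` and `compose`, the random word embeddings, and the use of a plain tanh composition in place of the LSTM cell (swapped in only to keep the example short). The point it shows is that each non-leaf node selects its weights by the semantic level of its tag, so nodes in the same level share one parameter set.

```python
import numpy as np

# Hypothetical grouping of Penn Treebank constituent tags into semantic levels.
# The clustering actually used in the paper is not given in the abstract.
LEVEL_OF_TAG = {
    "NP": "phrase", "VP": "phrase", "PP": "phrase", "ADJP": "phrase", "ADVP": "phrase",
    "S": "clause", "SBAR": "clause", "SINV": "clause", "SQ": "clause", "SBARQ": "clause",
}

def init_params(dim, seed=0):
    """Create one independent (W, b) parameter set per semantic level."""
    rng = np.random.default_rng(seed)
    levels = sorted(set(LEVEL_OF_TAG.values()))
    return {
        lvl: {"W": rng.normal(0.0, 0.1, (dim, 2 * dim)), "b": np.zeros(dim)}
        for lvl in levels
    }

def compose(node, params, embed):
    """Recursively compute a vector for a binary tree node.

    A leaf is (pos_tag, word); a non-leaf is (tag, left_child, right_child).
    Simplification: tanh composition stands in for the LSTM cell.
    """
    if len(node) == 2:              # leaf: look up the word embedding
        return embed[node[1]]
    tag, left, right = node
    p = params[LEVEL_OF_TAG[tag]]   # weights chosen by semantic level, not by node
    children = np.concatenate(
        [compose(left, params, embed), compose(right, params, embed)]
    )
    return np.tanh(p["W"] @ children + p["b"])

dim = 4
params = init_params(dim)
rng = np.random.default_rng(1)
embed = {w: rng.normal(0.0, 1.0, dim) for w in ["the", "cat", "chases", "dog"]}

tree = ("S",
        ("NP", ("DT", "the"), ("NN", "cat")),
        ("VP", ("VBZ", "chases"),
               ("NP", ("DT", "the"), ("NN", "dog"))))

vec = compose(tree, params, embed)  # sentence vector of length dim
```

In this sketch both NP nodes and the VP node reuse the "phrase" weights, while the root S node uses the separate "clause" weights. A standard RvNN would instead use a single parameter set for every non-leaf node.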