Highway II, an Extended Version of Highway Networks and Its Application to Densely Connected Bi-LSTM

Jie Yang, Shujuan Yu, Yun Zhang
DOI: https://doi.org/10.3233/jifs-190191
2019-01-01
Journal of Intelligent & Fuzzy Systems
Abstract: Increasing depth is essential to the success of deep neural networks, but it also makes them harder to train. In light of this, the authors propose a novel multi-layer LSTM model called Highway-DC, which introduces Highway Networks (Highway) into Densely Connected Bi-LSTM (DC-Bi-LSTM), a model in which the representation of each layer concatenates its own output with the outputs of all preceding layers. Highway is applied to control how much of each layer's input or output in DC-Bi-LSTM is passed to the next layer. However, results reveal that Highway-DC shows no improvement over DC-Bi-LSTM. An extended version of Highway, named Highway II, is therefore proposed; it eliminates the multiplicative connections between the transform gate and the output in Highway, thereby preserving the learning of each layer. The Highway II-based model is named Highway II-DC. Evaluated on 7 text classification benchmark datasets against DC-Bi-LSTM and other state-of-the-art approaches, Highway II-DC shows promising performance: it achieves state-of-the-art results on 3 datasets and surpasses DC-Bi-LSTM on 6 datasets, while converging faster. Moreover, it continues to benefit from additional layers at depths of up to 30, whereas DC-Bi-LSTM saturates early at a depth of 15.
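To make the gating idea in the abstract concrete, the sketch below contrasts a standard Highway layer with one possible reading of Highway II. The standard form, y = H(x) * T(x) + x * (1 - T(x)), is the usual Highway Networks formulation. The abstract does not give the Highway II equation, so `highway_ii_sketch` is only an assumption: it drops the multiplication between the transform gate T and the layer output H(x) and keeps the gated carry term. The function names, the tanh transform, and the list-based weights are all hypothetical choices for illustration and are not taken from the paper.

```python
import math


def sigmoid(z):
    """Logistic function, used here for the transform gate T."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, x):
    """Return W @ x + b for a list-of-lists matrix W and list vectors b, x."""
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]


def highway(x, W_h, b_h, W_t, b_t):
    """Standard Highway layer: y = H(x) * T(x) + x * (1 - T(x))."""
    h = [math.tanh(v) for v in affine(W_h, b_h, x)]  # layer output H(x)
    t = [sigmoid(v) for v in affine(W_t, b_t, x)]    # transform gate T(x)
    return [h_i * t_i + x_i * (1.0 - t_i) for h_i, t_i, x_i in zip(h, t, x)]


def highway_ii_sketch(x, W_h, b_h, W_t, b_t):
    """ASSUMED reading of Highway II: y = H(x) + x * (1 - T(x)).

    The gate no longer multiplies H(x), so the layer output passes through
    unscaled. This form is a guess based on the abstract's wording only.
    """
    h = [math.tanh(v) for v in affine(W_h, b_h, x)]  # layer output H(x)
    t = [sigmoid(v) for v in affine(W_t, b_t, x)]    # gate now only scales the carry
    return [h_i + x_i * (1.0 - t_i) for h_i, t_i, x_i in zip(h, t, x)]


if __name__ == "__main__":
    x = [0.5, -0.5]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    bias = [0.0, 0.0]
    # With zero gate weights and biases, T(x) = 0.5 for every unit.
    print("Highway:   ", highway(x, identity, bias, zeros, bias))
    print("Highway II:", highway_ii_sketch(x, identity, bias, zeros, bias))
```

In the standard layer a nearly closed gate (T close to 0) suppresses H(x) almost entirely, whereas in the assumed Highway II form H(x) always contributes in full. That is one way to read "preserving the learning of each layer", but the paper itself should be consulted for the exact definition.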