Mind the Remainder: Taylor’s Theorem View on Recurrent Neural Networks

Xiang Guan, Yang Yang, Jingjing Li, Xing Xu, Heng Tao Shen
DOI: https://doi.org/10.1109/tnnls.2020.3042537
IF: 14.255
2022-04-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: Recurrent neural networks (RNNs) have gained tremendous popularity in almost every sequence modeling task. Despite this effort, discrete unstructured data such as text, audio, and video remain difficult to embed in a feature space. Research on improving neural networks has accelerated since the introduction of more complex or deeper architectures. The improvements of previous methods depend heavily on the model and come at the expense of huge computational resources, yet few of them pay attention to the algorithm. In this article, we bridge the Taylor series with the construction of RNNs. Training an RNN can be considered as parameter estimation for the Taylor series. However, we found that the finite Taylor series contains a discrete term, called the remainder, that cannot be optimized using gradient descent; this is part of the reason for the truncation error and for the model falling into a local optimum. To address this, we propose a training algorithm that estimates the range of the remainder and introduces a remainder sampled from this continuous space into the RNN to assist in optimizing the parameters. Notably, the performance of the RNN can be improved without changing the RNN architecture in the testing phase. We demonstrate that our approach achieves state-of-the-art performance in action recognition and cross-modal retrieval tasks.
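For reference, the remainder the abstract refers to can be read against the standard statement of Taylor's theorem with the Lagrange form of the remainder. This is the textbook result only; the abstract does not say how the paper maps it onto the RNN, so the symbols below (f, a, x, n, ξ, M) are generic and are assumptions for illustration, not the authors' notation.

```latex
% Taylor's theorem for a function f that is (n+1) times differentiable
% on an interval containing a and x (generic notation, assumed here).
f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(x-a)^{k} + R_n(x),
\qquad
R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x-a)^{n+1}
\quad \text{for some } \xi \text{ between } a \text{ and } x.

% If |f^{(n+1)}| \le M on that interval, the remainder lies in a bounded range:
|R_n(x)| \le \frac{M\,|x-a|^{n+1}}{(n+1)!}.
```

Truncating the series after the degree-n term discards R_n(x), which is the truncation error; the bound gives a range in which that discarded term must lie.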
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture