Xigmoid: an Approach to Improve the Gating Mechanism of RNN

Jingke She, Shanshan Gong, Suyuan Yang, Hantao Yang, Shaofei Lu
DOI: https://doi.org/10.1109/ijcnn55064.2022.9892346
2022-01-01
Abstract: This work proposes a new approach to the gating mechanism of RNN-class models. A transfer function is embedded into the original sigmoid to form a new gate function called xigmoid. The purpose is to alleviate the gradient amplification problem that arises when the models try to learn features at the far end of a long time series. Using the xigmoid function, the original LSTM and GRU are converted to xLSTM and xGRU, respectively. The initialization method for the trainable parameters of xigmoid is also derived and discussed as necessary support for the new method. Verification experiments compare xLSTM and xGRU against several baseline models and show that the xigmoid-based models converge faster during training and achieve better accuracy. The code and datasets are available at https://github.com/privateos/xigmoid