Resformer: Combine quadratic linear transformation with efficient sparse Transformer for long-term series forecasting

Gongguan Chen, Hua Wang, Yepeng Liu, Mingli Zhang, Fan Zhang
DOI: https://doi.org/10.3233/ida-227006
IF: 1.7
2023-11-08
Intelligent Data Analysis
Abstract: With the continuous development of deep learning, long sequence time-series forecasting (LSTF) has attracted increasing attention in power consumption prediction, traffic prediction, and stock prediction. Recent studies favor various improved variants of the Transformer. While these models have made breakthroughs in reducing the Transformer's time and space complexity, some problems remain: their predictive power is slightly lower than that of the standard Transformer, and they ignore the importance of special values in the time series. To solve these problems, we designed a more concise network named Resformer, which has four significant characteristics: (1) The fully sparse self-attention mechanism achieves O(L log L) time complexity. (2) The AMS module processes the special values of the time series and has comparable performance on sequence dependency alignment. (3) Using quadratic linear transformation, a simple LT module is designed to replace the self-attention mechanism; it effectively reduces redundant information. (4) The DistPooling method, based on data distribution, is proposed to suppress redundant information and noise. Extensive experiments on real-world datasets show that Resformer outperforms both the existing improved models and the standard Transformer.
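To make the O(L log L) figure in point (1) concrete, the following is a minimal NumPy sketch of one way sparse attention can reach that cost. It is not Resformer's mechanism, which the abstract does not specify. It substitutes a generic causal log-sparse pattern in which query i attends only to itself and to positions i - 1, i - 2, i - 4, and so on. The function names `log_sparse_indices` and `log_sparse_attention` are hypothetical.

```python
import numpy as np

def log_sparse_indices(i):
    """Positions query i may attend to: itself and i - 2**k for k >= 0.

    Assumed illustrative pattern, not taken from the Resformer paper.
    """
    idx = [i]
    step = 1
    while i - step >= 0:
        idx.append(i - step)
        step *= 2
    return np.array(idx)

def log_sparse_attention(Q, K, V):
    """Single-head attention where each query scores only O(log L) keys.

    Returns the output and the number of query-key scores computed.
    """
    L, d = Q.shape
    out = np.zeros((L, V.shape[1]))
    n_scores = 0
    for i in range(L):
        idx = log_sparse_indices(i)
        scores = K[idx] @ Q[i] / np.sqrt(d)   # scaled dot products
        w = np.exp(scores - scores.max())     # numerically stable softmax
        w /= w.sum()
        out[i] = w @ V[idx]
        n_scores += len(idx)
    return out, n_scores
```

With this pattern each query scores at most floor(log2 L) + 2 keys, so the total number of scores grows as O(L log L), whereas full self-attention computes L² scores.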
computer science, artificial intelligence