FTMixer: Frequency and Time Domain Representations Fusion for Time Series Modeling

Zhengnan Li, Yunxiao Qin, Xilong Cheng, Yuting Tan
2024-08-10
Abstract: Time series data can be represented in both the time and frequency domains, with the time domain emphasizing local dependencies and the frequency domain highlighting global dependencies. To harness the strengths of both domains in capturing local and global dependencies, we propose the Frequency and Time Domain Mixer (FTMixer). To exploit the global characteristics of the frequency domain, we introduce the Frequency Channel Convolution (FCC) module, designed to capture global inter-series dependencies. Inspired by the windowing concept in frequency domain transformations, we present the Windowing Frequency Convolution (WFC) module to capture local dependencies. The WFC module first applies frequency transformation within each window, followed by convolution across windows. Furthermore, to better capture these local dependencies, we employ a channel-independent scheme to mix the time domain and frequency domain patches. Notably, FTMixer employs the Discrete Cosine Transformation (DCT) with real numbers instead of the complex-number-based Discrete Fourier Transformation (DFT), enabling direct utilization of modern deep learning operators in the frequency domain. Extensive experimental results across seven real-world long-term time series datasets demonstrate the superiority of FTMixer in terms of both forecasting performance and computational efficiency.
Machine Learning
What problem does this paper attempt to address?
This paper attempts to solve two main problems in time-series prediction:

1. **Handling the complexity of complex-number representations**: Existing methods usually rely on the Discrete Fourier Transform (DFT), which introduces complex-number representations of time-series data. However, deep-learning techniques such as Batch Normalization and activation functions are not suited to complex numbers. It is possible to adapt them by processing the real and imaginary parts separately, but this increases the number of parameters and the computational cost, and may still perform poorly.
2. **Loss of local information**: Global frequency-domain transformations mainly capture global dependencies and may mask critical local changes (such as sudden peaks and irregular patterns), which are crucial for accurate prediction and for understanding time-series dynamics.

To solve these problems, the paper proposes a new method named **Frequency and Time domain Mixer (FTMixer)**. This method combines the advantages of the time domain and the frequency domain in the following ways:

- **Discrete Cosine Transformation (DCT)**: Unlike the Discrete Fourier Transform (DFT), the DCT operates only with real numbers, making it more compatible with modern deep-learning techniques. In addition, the DCT uses magnitude to represent frequency-domain information, simplifying the calculation of the loss function in the frequency domain.
- **Frequency Channel Convolution (FCC) module**: After the entire sequence is embedded into the frequency domain, convolution is performed to analyze global patterns.
- **Windowed Frequency-Time Convolution (WFTC) module**: The time series is divided into segments of different scales, a frequency-domain transformation is applied within each segment, and convolution is then performed across these segments, thereby capturing local changes.
- **Dual-Domain Loss Function (DDLF)**: The loss is computed in the time domain and in the frequency domain separately, taking advantage of the DCT's ability to concentrate energy into fewer coefficients and improving the model's ability to capture and use domain-specific features.

Through these innovations, FTMixer aims to overcome the limitations of existing methods and provide a more effective time-series prediction method. Experimental results show that FTMixer outperforms existing methods in both prediction performance and computational efficiency on seven real-world long-term time-series datasets.
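
The DCT-related ideas summarized above (real-valued coefficients, a transform applied inside each window, and a loss evaluated in both domains) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the non-overlapping window layout, the `alpha` weighting, and the choice of squared error in the time domain with absolute error in the DCT domain are all assumptions made for illustration, and the convolution layers of the FCC and WFTC modules are left out entirely.

```python
import math


def dct2_ortho(x):
    """Orthonormal DCT-II of a real sequence; every coefficient is real."""
    n = len(x)
    coeffs = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * s)
    return coeffs


def windowed_dct(x, window):
    """Split x into non-overlapping windows and apply the DCT inside each one.

    Assumption: windows do not overlap and a trailing partial window is dropped.
    """
    return [dct2_ortho(x[i:i + window])
            for i in range(0, len(x) - window + 1, window)]


def dual_domain_loss(pred, target, alpha=0.5):
    """Weighted sum of a time-domain and a DCT-domain error.

    Assumptions: squared error in time, absolute error on DCT coefficients,
    and a hypothetical mixing weight `alpha`.
    """
    n = len(pred)
    time_loss = sum((p - q) ** 2 for p, q in zip(pred, target)) / n
    pred_f, target_f = dct2_ortho(pred), dct2_ortho(target)
    freq_loss = sum(abs(p - q) for p, q in zip(pred_f, target_f)) / n
    return alpha * time_loss + (1.0 - alpha) * freq_loss


if __name__ == "__main__":
    series = [0.5, -1.0, 2.0, 3.0, 0.0, 1.5, -2.5, 4.0]
    print(dct2_ortho(series))       # real-valued global spectrum
    print(windowed_dct(series, 4))  # two local spectra of length 4
    print(dual_domain_loss(series, [v + 0.1 for v in series]))
```

Because the orthonormal DCT preserves energy, a squared-error loss on DCT coefficients would equal the squared-error loss in the time domain; the sketch therefore uses an absolute error on the coefficients so that the two terms actually differ. Which norms the paper itself uses is not stated in this summary.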