ContiFormer: Continuous-Time Transformer for Irregular Time Series Modeling

Yuqi Chen, Kan Ren, Yansen Wang, Yuchen Fang, Weiwei Sun, Dongsheng Li
2024-02-16
Abstract: Modeling continuous-time dynamics on irregular time series is critical to account for data evolution and correlations that occur continuously. Traditional methods including recurrent neural networks or Transformer models leverage inductive bias via powerful neural architectures to capture complex patterns. However, due to their discrete characteristic, they have limitations in generalizing to continuous-time data paradigms. Though neural ordinary differential equations (Neural ODEs) and their variants have shown promising results in dealing with irregular time series, they often fail to capture the intricate correlations within these sequences. It is challenging yet demanding to concurrently model the relationship between input data points and capture the dynamic changes of the continuous-time system. To tackle this problem, we propose ContiFormer that extends the relation modeling of vanilla Transformer to the continuous-time domain, which explicitly incorporates the modeling abilities of continuous dynamics of Neural ODEs with the attention mechanism of Transformers. We mathematically characterize the expressive power of ContiFormer and illustrate that, by curated designs of function hypothesis, many Transformer variants specialized in irregular time series modeling can be covered as a special case of ContiFormer. A wide range of experiments on both synthetic and real-world datasets have illustrated the superior modeling capacities and prediction performance of ContiFormer on irregular time series data. The project link is
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the challenge of modeling irregular time series, which arise in real-world applications such as disease prevention, financial decision-making, and earthquake prediction. Irregular time series have uneven sampling intervals and may contain missing data, while the underlying data-generating process is assumed to be continuous. Traditional recurrent neural networks (RNNs) and Transformer models can capture complex patterns, but their discrete nature limits their effectiveness on continuous-time data. Neural ordinary differential equations (Neural ODEs) and their variants have shown promise on irregular time series, but they often fail to capture the intricate correlations within these sequences.
To address these issues, the paper proposes a new model called ContiFormer, which extends the relation modeling of the Transformer to the continuous-time domain. It combines the continuous-dynamics modeling of Neural ODEs with the attention mechanism of the Transformer, so that attention is no longer restricted to discrete positions. ContiFormer defines a latent trajectory for each observation and extends the Transformer's dot-product operation to the continuous-time domain in order to capture the complex, evolving relationships between observations.
The main contributions of the paper are:
1. Introducing a continuous-time Transformer model, which the authors describe as the first to incorporate a continuous-time mechanism into the Transformer's attention computation.
2. Proposing a novel reparameterization method that allows continuous-time attention over different time ranges to be computed in parallel, resolving the conflict between continuous-time modeling and the Transformer's parallel computation.
3. Proving mathematically that various Transformer variants can be viewed as special cases of ContiFormer, which indicates its broader applicability.
4. Demonstrating the superior performance of ContiFormer in modeling and predicting irregular time series through experiments on synthetic and real-world datasets.
The paper also compares ContiFormer with other methods such as RNNs, Transformers, and Neural ODEs, highlighting ContiFormer's advantages in preserving long-term information and capturing continuous-time dynamics.
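The idea of extending dot-product attention to continuous time can be illustrated with a minimal sketch. This is not the authors' implementation, and every name in it is hypothetical. It assumes a toy linear ODE dk/dt = -decay * k in place of a learned Neural ODE, assumes that the attention score is the average of the query-key inner product over the interval between the observation time and the query time, and evolves values with the same toy dynamics. With decay set to zero and a constant query, the sketch reduces to ordinary softmax attention, which loosely illustrates the simplest case of the "special case" claim under these assumptions.

```python
import numpy as np

def trajectory(x0, t0, t, decay):
    """Closed-form solution of the toy ODE dx/dt = -decay * x with x(t0) = x0.
    (Assumption: stands in for a learned Neural ODE trajectory.)"""
    return x0 * np.exp(-decay * (t - t0))

def continuous_score(q_fn, k0, t0, t, decay, n_steps=64):
    """Average of q(tau) . k(tau) over [t0, t], using the trapezoidal rule.
    (Assumption: this averaged-integral form is chosen for illustration.)"""
    if np.isclose(t, t0):
        return float(q_fn(t0) @ k0)
    taus = np.linspace(t0, t, n_steps)
    vals = np.array([q_fn(tau) @ trajectory(k0, t0, tau, decay) for tau in taus])
    integral = np.sum((vals[1:] + vals[:-1]) * np.diff(taus)) / 2.0
    return float(integral / (t - t0))

def continuous_attention(times, keys, values, q_fn, t_query, decay=0.0):
    """Attend from t_query over all observations made at or before t_query."""
    mask = times <= t_query
    d = keys.shape[1]
    scores = np.array([
        continuous_score(q_fn, k0, t0, t_query, decay) / np.sqrt(d)
        for k0, t0 in zip(keys[mask], times[mask])
    ])
    # Numerically stable softmax over the visible observations.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Evolve each value from its observation time to the query time.
    evolved = np.array([
        trajectory(v0, t0, t_query, decay)
        for v0, t0 in zip(values[mask], times[mask])
    ])
    return weights @ evolved, weights

# Irregularly sampled toy data (made up for the example).
rng = np.random.default_rng(0)
times = np.array([0.0, 0.3, 1.1, 1.2, 2.7])
keys = rng.normal(size=(5, 4))
values = rng.normal(size=(5, 4))
q = rng.normal(size=4)

# decay=0 and a constant query: trajectories are constant in time.
out_static, w_static = continuous_attention(
    times, keys, values, lambda t: q, t_query=3.0, decay=0.0
)
# decay>0: keys and values drift between observation and query time.
out_dyn, w_dyn = continuous_attention(
    times, keys, values, lambda t: q, t_query=3.0, decay=0.5
)
```

Because the query time is an arbitrary real number, the same function can be evaluated between or after observations without resampling the series onto a regular grid. The per-observation loop is written for clarity only; it does not attempt the parallel reparameterization that the paper describes.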