SigKAN: Signature-Weighted Kolmogorov-Arnold Networks for Time Series

Hugo Inzirillo, Remi Genet
2024-06-26
Abstract: We propose a novel approach that enhances multivariate function approximation using learnable path signatures and Kolmogorov-Arnold networks (KANs). We enhance the learning capabilities of these networks by weighting the values obtained by KANs using learnable path signatures, which capture important geometric features of paths. This combination allows for a more comprehensive and flexible representation of sequential and temporal data. We demonstrate through studies that our SigKANs with learnable path signatures perform better than conventional methods across a range of function approximation challenges. By leveraging path signatures in neural networks, this method offers intriguing opportunities to enhance performance in time series analysis and time series forecasting, among other fields.
Machine Learning
What problem does this paper attempt to address?
The paper addresses multivariate time series forecasting, particularly in financial applications such as predicting trading volumes and asset price movements. Specifically, the authors propose a novel method, Signature-Weighted Kolmogorov-Arnold Networks (SigKANs), to improve multivariate function approximation. The core contribution of SigKANs lies in combining path signatures with Kolmogorov-Arnold Networks (KANs). Path signatures come from rough path theory and capture significant geometric features of paths or trajectories. By making the path signatures learnable, the method can better represent contextual information in sequences and time-series data.

The SigKAN architecture has two main parts:

1. **Gated Residual KANs**: This part regulates the flow of information and improves the interpretability of the network.
2. **Learnable Path Signature Layer**: This layer computes the path signature of each input path and weights it using learnable parameters.

The paper validates SigKANs on two tasks:

1. **Predicting Trading Volume**: Using trading volume data of multiple assets to predict the future trading volume of Bitcoin.
2. **Predicting Future Absolute Returns**: Using only Bitcoin data to predict its future absolute price changes.

For the experimental setup, the paper details data preprocessing, the choice of loss function, and training specifics. For example, in the trading volume prediction task the data is divided into training and test sets, and Root Mean Square Error (RMSE) is used as the loss function for optimization. SigKANs are compared with several reference models (TKAN, GRU, and LSTM). Experimental results show that SigKANs outperform these reference models across multiple prediction horizons, especially in short-term forecasting.
Furthermore, SigKANs demonstrate more stable performance than models based on recurrent networks. Although SigKANs have a relatively large number of parameters, their advantages in performance and stability remain significant. Overall, the paper presents SigKANs as a tool for complex multivariate time series forecasting problems, with potential applications in finance and related fields.
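
To make the idea of signature-based weighting more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It computes the first two levels of the signature of a piecewise-linear path and turns them into weights that rescale a stand-in KAN output. The truncation at depth 2, the linear map `W`, `b`, the softmax normalization, and the names `signature_depth2`, `signature_weights`, and `kan_output` are all assumptions made for illustration; the summary above does not specify the exact form of the learnable weighting.

```python
import math
import random


def signature_depth2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    `path` is a list of T points, each a list of d floats.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        dx = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one linear segment:
        # S2 <- S2 + S1 (x) dx + 0.5 * dx (x) dx, using S1 before its update.
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * dx[j] + 0.5 * dx[i] * dx[j]
        for i in range(d):
            level1[i] += dx[i]
    return level1, level2


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def signature_weights(path, W, b):
    """Hypothetical weighting layer: linear map of the signature, then softmax."""
    level1, level2 = signature_depth2(path)
    features = level1 + [v for row in level2 for v in row]  # length d + d*d
    logits = [
        sum(w * f for w, f in zip(row, features)) + bias
        for row, bias in zip(W, b)
    ]
    return softmax(logits)


# Toy usage: a random-walk path with d channels and a stand-in KAN output.
random.seed(0)
T, d, hidden = 20, 3, 4
path = [[0.0] * d]
for _ in range(T - 1):
    path.append([x + random.gauss(0.0, 1.0) for x in path[-1]])

# W and b play the role of learnable parameters (randomly initialized here).
W = [[random.gauss(0.0, 0.1) for _ in range(d + d * d)] for _ in range(hidden)]
b = [0.0] * hidden

kan_output = [random.gauss(0.0, 1.0) for _ in range(hidden)]  # placeholder values
weights = signature_weights(path, W, b)
weighted_output = [w * k for w, k in zip(weights, kan_output)]
```

In this sketch the level-1 term is simply the total displacement of the path, while the level-2 terms encode the order in which the channels move, which is the kind of geometric information the summary refers to. In a trained model, `W` and `b` would be fitted jointly with the KAN layers rather than sampled at random.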