TKAN: Temporal Kolmogorov-Arnold Networks

Remi Genet, Hugo Inzirillo
2024-06-06
Abstract: Recurrent Neural Networks (RNNs) have revolutionized many areas of machine learning, particularly natural language and sequential data processing. Long Short-Term Memory (LSTM) networks have demonstrated their ability to capture long-term dependencies in sequential data. Inspired by Kolmogorov-Arnold Networks (KANs), a promising alternative to Multi-Layer Perceptrons (MLPs), we propose a new neural network architecture that draws on both KANs and the LSTM: Temporal Kolmogorov-Arnold Networks (TKANs). TKANs combine the strengths of both networks and are composed of Recurring Kolmogorov-Arnold Network (RKAN) layers with embedded memory management. This innovation enables us to perform multi-step time series forecasting with enhanced accuracy and efficiency. By addressing the limitations of traditional models in handling complex sequential patterns, the TKAN architecture offers significant potential for advancements in fields requiring forecasts more than one step ahead.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper aims to improve the accuracy and stability of multi-step time series forecasting. It proposes a new neural network architecture, Temporal Kolmogorov-Arnold Networks (TKAN), which combines the advantages of Recurrent Neural Networks (RNNs) and Kolmogorov-Arnold Networks (KANs). The architecture is intended to mitigate the vanishing and exploding gradient problems that traditional RNNs face when learning long-term dependencies. To that end, the paper introduces Recurring Kolmogorov-Arnold Network (RKAN) layers with gating mechanisms, borrowing from Long Short-Term Memory (LSTM) units to manage memory over sequences. The reported experiments show TKAN outperforming GRU and LSTM baselines on multi-step prediction tasks, with greater stability and better generalization, an advantage that is most evident at longer forecast horizons.
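To make the idea of a gated recurrent layer built around a KAN more concrete, here is a minimal NumPy sketch. It is not the authors' implementation, and every name in it (`KanLikeLayer`, `GatedKanCell`, `run_sequence`) is hypothetical. It rests on three assumptions: Gaussian bumps stand in for the B-spline basis used in KANs, the KAN-like layer drives only the output gate, and the remaining gates follow the standard LSTM equations. The exact wiring in TKAN may differ.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class KanLikeLayer:
    """Each input-output edge carries its own learnable univariate function.

    Assumption: Gaussian bumps replace the B-spline basis of a real KAN.
    """

    def __init__(self, n_in, n_out, n_basis=5, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        self.centers = np.linspace(-2.0, 2.0, n_basis)            # (n_basis,)
        self.coef = rng.normal(0.0, 0.1, (n_in, n_basis, n_out))  # edge coefficients

    def __call__(self, x):
        # basis[i, k] = exp(-(x_i - center_k)^2)
        basis = np.exp(-(x[:, None] - self.centers[None, :]) ** 2)
        # out[o] = sum_i sum_k basis[i, k] * coef[i, k, o]
        return np.einsum("ik,iko->o", basis, self.coef)


class GatedKanCell:
    """LSTM-style cell whose output gate is computed by a KAN-like layer.

    Assumption: which gate the KAN-like layer drives is an illustrative choice.
    """

    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        n_cat = n_in + n_hidden
        # Affine maps for the forget gate, input gate, and candidate state.
        self.W = rng.normal(0.0, 0.1, (3, n_hidden, n_cat))
        self.b = np.zeros((3, n_hidden))
        self.kan = KanLikeLayer(n_cat, n_hidden, rng=rng)

    def step(self, x_t, h_prev, c_prev):
        z = np.concatenate([x_t, h_prev])
        f = sigmoid(self.W[0] @ z + self.b[0])        # forget gate
        i = sigmoid(self.W[1] @ z + self.b[1])        # input gate
        c_tilde = np.tanh(self.W[2] @ z + self.b[2])  # candidate cell state
        c = f * c_prev + i * c_tilde                  # long-term memory update
        o = sigmoid(self.kan(z))                      # output gate from the KAN-like layer
        h = o * np.tanh(c)                            # new hidden state
        return h, c


def run_sequence(cell, xs, n_hidden):
    """Feed a (T, n_in) sequence through the cell and return the final hidden state."""
    h = np.zeros(n_hidden)
    c = np.zeros(n_hidden)
    for x_t in xs:
        h, c = cell.step(x_t, h, c)
    return h
```

For multi-step forecasting, one common design (again an assumption here, not a claim about the paper) is to pass the final hidden state through a dense layer with one output per forecast step, so that all horizons are predicted in a single forward pass.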