Abstract: Recent work has established an alternative to traditional multi-layer perceptron neural networks in the form of Kolmogorov-Arnold Networks (KAN). The general KAN framework uses learnable activation functions on the edges of the computational graph followed by summation on nodes. The learnable edge activation functions in the original implementation are basis spline functions (B-Spline). Here, we present a model in which learnable grids of B-Spline activation functions are replaced by grids of re-weighted sine functions. We show that this leads to better or comparable numerical performance to B-Spline KAN models on the MNIST benchmark, while also providing a substantial speed increase on the order of 4-8 times.
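
For readers who prefer notation, the layer structure described in the abstract (learnable functions on edges, summation on nodes) can be written as follows. The first line is the generic KAN layer. The second line is only an assumed form of a "grid of re-weighted sine functions": the symbols $A$, $\omega$, $\varphi_k$, and the grid size $G$ are illustrative and are not taken from the abstract.

$$
\begin{aligned}
x^{(l+1)}_j &= \sum_{i=1}^{n_l} \phi^{(l)}_{j,i}\!\left(x^{(l)}_i\right), \\
\phi^{(l)}_{j,i}(x) &= \sum_{k=1}^{G} A^{(l)}_{j,i,k}\,\sin\!\left(\omega^{(l)} x + \varphi_k\right) \quad \text{(assumed sine parameterization)}.
\end{aligned}
$$

Here $n_l$ is the width of layer $l$, each $\phi^{(l)}_{j,i}$ is the learnable activation on the edge from input node $i$ to output node $j$, and the amplitudes $A^{(l)}_{j,i,k}$ play the role of the learnable re-weighting.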

What problem does this paper attempt to address?

The paper primarily aims to address the following issues:

1. **Improving the Performance of Kolmogorov-Arnold Networks (KAN)**: KAN is a neural network architecture based on the Kolmogorov-Arnold representation theorem, designed as an alternative to traditional multilayer perceptrons (MLP). Existing KAN implementations such as B-SplineKAN, which uses B-spline activation functions, perform well on certain tasks but still lag behind MLPs in speed and performance. The paper therefore proposes a new KAN implementation, SineKAN, which uses sine activation functions instead of B-spline activation functions.
2. **Enhancing Model Speed and Accuracy**: With sine activation functions, SineKAN achieves better or comparable accuracy on the MNIST benchmark and runs on the order of 4 to 8 times faster than B-SplineKAN, which suggests greater practical utility.
3. **Exploring the Advantages of Sine Activation Functions**: Sine activation functions can provide stronger numerical performance and keep output values more stable in deep models, avoiding the collapse of values as depth increases. They also scale better across multiple layers.
4. **Optimizing Weight Initialization Strategies**: The paper also explores a new weight initialization strategy intended to keep the model stable and consistent across different grid sizes. This strategy helps the model maintain good performance at various depths.

In summary, the paper aims to improve the KAN architecture by introducing sine activation functions and an optimized weight initialization strategy, making it faster than the existing B-SplineKAN model with comparable or better accuracy and improved stability.
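
To make the idea of edge activations built from re-weighted sines more concrete, here is a minimal NumPy sketch of a forward pass through such a layer. It is not the authors' implementation: the class name `SineKANLayer`, the fixed phase grid, the per-feature frequency, and the simple amplitude scaling are all assumptions chosen for illustration, and the scaling in particular is not the initialization strategy the paper proposes.

```python
import numpy as np


class SineKANLayer:
    """Hypothetical sketch of one sine-based KAN layer (forward pass only)."""

    def __init__(self, in_dim, out_dim, grid_size=8, seed=0):
        rng = np.random.default_rng(seed)
        # Assumption: one fixed phase per grid point, spread over [0, pi).
        self.phase = np.linspace(0.0, np.pi, grid_size, endpoint=False)
        # Assumption: one frequency per input feature (learnable in a real model).
        self.freq = np.ones(in_dim)
        # Amplitudes that re-weight every sine term on every edge.
        # Assumption: this scaling keeps |output| <= 1; it is NOT the paper's scheme.
        scale = 1.0 / (in_dim * grid_size)
        self.amp = rng.uniform(-1.0, 1.0, size=(out_dim, in_dim, grid_size)) * scale
        self.bias = np.zeros(out_dim)

    def __call__(self, x):
        # x has shape (batch, in_dim); sines has shape (batch, in_dim, grid_size).
        sines = np.sin(
            x[:, :, None] * self.freq[None, :, None] + self.phase[None, None, :]
        )
        # Summation on nodes: contract over input features and grid points.
        return np.einsum("bik,oik->bo", sines, self.amp) + self.bias


# Usage: two stacked layers on a batch of four flattened 28x28 inputs.
layer1 = SineKANLayer(784, 32)
layer2 = SineKANLayer(32, 10)
x = np.random.default_rng(1).normal(size=(4, 784))
hidden = layer1(x)
logits = layer2(hidden)
print(logits.shape)  # (4, 10)
```

Because every sine term is bounded by 1 in magnitude, the output magnitude of this sketch is controlled entirely by the amplitudes, which gives some intuition for why initialization matters for stability across grid sizes and depths.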

SineKAN: Kolmogorov-Arnold Networks Using Sinusoidal Activation Functions
