HyperKAN: Kolmogorov-Arnold Networks make Hyperspectral Image Classificators Smarter

Valeriy Lobanov, Nikita Firsov, Evgeny Myasnikov, Roman Khabibullin, Artem Nikonorov
2024-09-06
Abstract: In traditional neural network architectures, a multilayer perceptron (MLP) is typically employed as a classification block following the feature extraction stage. However, the Kolmogorov-Arnold Network (KAN) presents a promising alternative to MLP, offering the potential to enhance prediction accuracy. In this paper, we propose the replacement of linear and convolutional layers of traditional networks with KAN-based counterparts. These modifications allowed us to significantly increase the per-pixel classification accuracy for hyperspectral remote-sensing images. We modified seven different neural network architectures for hyperspectral image classification and observed a substantial improvement in the classification accuracy across all the networks. The architectures considered in the paper include baseline MLP, state-of-the-art 1D (1DCNN) and 3D convolutional (two different 3DCNN, NM3DCNN), and transformer (SSFTT) architectures, as well as newly proposed M1DCNN. The greatest effect was achieved for convolutional networks working exclusively on spectral data, and the best classification quality was achieved using a KAN-based transformer architecture. All the experiments were conducted using seven openly available hyperspectral datasets. Our code is available at https://github.com/f-neumann77/HyperKAN.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the problem of hyperspectral image (HSI) classification. Specifically, it proposes HyperKAN, an approach based on Kolmogorov-Arnold Networks (KAN), to improve the performance of traditional neural networks on HSI classification tasks.

### Main Objectives of the Paper:

1. **Replace Traditional Layers**: Investigate replacing traditional linear, convolutional, and attention layers with corresponding KAN layers to improve existing neural network architectures (such as MLP, 1D/2D/3D convolutional networks, and transformers).
2. **Improve Classification Accuracy**: Validate the use of KAN layers in seven different neural network architectures through experiments, and demonstrate a significant accuracy improvement on seven public hyperspectral datasets.
3. **Address Data Scarcity**: Explore how KAN networks perform in small-sample scenarios, given that HSI data is difficult to annotate and labeled data is limited.

### Main Contributions:

- Introduced three types of KAN blocks to replace traditional classification and feature extraction layers.
- Modified seven different neural network architectures (including 1D-CNN, 2D-CNN, 3D-CNN, and transformers) and demonstrated the superior performance of the modified networks on HSI classification tasks.
- Verified the advantages of the KAN-based architectures on seven public hyperspectral datasets.

Through these improvements, the paper demonstrates the potential of KAN in network design, particularly its advantages in handling high-dimensional nonlinear data.
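To make the idea of swapping a linear layer for a KAN layer more concrete, here is a minimal sketch of a KAN-style layer in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the names `hat_basis` and `KANLayer`, the knot count, and the input range are all hypothetical, and the learnable edge functions use degree-1 B-splines (hat functions) rather than the higher-order splines and additional base activation found in full KAN implementations. The point it shows is structural: a linear layer computes `sum_i w[j][i] * x[i]`, whereas a KAN layer computes `sum_i phi[j][i](x[i])`, where each `phi` is a learnable univariate function.

```python
import random


def hat_basis(x, grid):
    """Degree-1 B-spline (hat function) values at x on a uniform grid."""
    h = grid[1] - grid[0]
    x = min(max(x, grid[0]), grid[-1])  # clamp inputs to the grid range
    return [max(0.0, 1.0 - abs(x - g) / h) for g in grid]


class KANLayer:
    """Hypothetical KAN-style layer: one learnable 1-D function per edge."""

    def __init__(self, in_dim, out_dim, num_knots=8,
                 x_min=-1.0, x_max=1.0, seed=0):
        rng = random.Random(seed)
        step = (x_max - x_min) / (num_knots - 1)
        self.grid = [x_min + k * step for k in range(num_knots)]
        # coef[j][i][k]: weight of basis function k on the edge i -> j.
        self.coef = [[[rng.gauss(0.0, 0.1) for _ in range(num_knots)]
                      for _ in range(in_dim)]
                     for _ in range(out_dim)]

    def edge(self, j, i, x):
        """Evaluate the learnable function phi[j][i] at scalar input x."""
        basis = hat_basis(x, self.grid)
        return sum(c * b for c, b in zip(self.coef[j][i], basis))

    def forward(self, x):
        """Output j is the sum over inputs i of phi[j][i](x[i])."""
        return [sum(self.edge(j, i, xi) for i, xi in enumerate(x))
                for j in range(len(self.coef))]


# Usage: one pixel's spectrum (5 bands, assumed scaled to [-1, 1])
# mapped to 3 class scores by an untrained layer.
spectrum = [0.12, -0.40, 0.88, 0.05, -0.73]
layer = KANLayer(in_dim=5, out_dim=3)
logits = layer.forward(spectrum)
predicted_class = max(range(len(logits)), key=logits.__getitem__)
print(logits, predicted_class)
```

In a per-pixel HSI classifier, a layer of this kind would take the place of the final linear classification block, with the coefficients trained by gradient descent in a deep learning framework. The sketch above only covers the forward pass.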