Learning a 3D-CNN and Convolution Transformers for Hyperspectral Image Classification

Yufan Wang, Xiaodong Yu, Xiaoyan Wen, Xiaohui Li, Hongbin Dong, Shuying Zang
DOI: https://doi.org/10.1109/lgrs.2024.3365615
IF: 5.343
2024-01-01
IEEE Geoscience and Remote Sensing Letters
Abstract: Hyperspectral image classification is an important but challenging task. Conventional convolutional neural networks (CNNs) can extract local spectral-spatial features but ignore long-range dependencies and global features. To address this problem, we propose a new model that combines a 3D-CNN with a Convolutional Vision Transformer, exploiting the strength of CNNs in local feature extraction while retaining the strength of Transformers in modeling long-range dependencies. The model is tested on three publicly available hyperspectral image datasets, and the results show that it outperforms other state-of-the-art models in classification accuracy and robustness. The source code is available at https://github.com/Dreamvai/ViT-Convolution. The model proposed in this letter offers a new approach to hyperspectral image classification and extends the application of convolutional transformers to a new field. In future work, we intend to further explore the performance of the convolutional transformer and the possibility of combining it with other types of data.
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics
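
The abstract describes the architecture only at a high level: a 3D-CNN for local spectral-spatial features, followed by a convolutional transformer for long-range dependencies. The following is a minimal PyTorch sketch of that general idea, not the authors' implementation (their code is at the repository linked in the abstract). The class names `ConvAttention` and `HybridSketch`, the layer sizes, the band count of 30, the 9x9 patch size, and the use of depthwise convolutions for the query/key/value projections are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class ConvAttention(nn.Module):
    """Self-attention whose Q/K/V inputs come from depthwise 3x3 convolutions.

    Hypothetical block: the convolutions inject local spatial context before
    the global attention step, in the spirit of convolutional transformers.
    """

    def __init__(self, dim, heads=4):
        super().__init__()
        self.q = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.k = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.v = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):  # x: (B, C, H, W)
        b, c, h, w = x.shape
        # Turn each feature map into a token sequence of shape (B, H*W, C).
        q, k, v = (conv(x).flatten(2).transpose(1, 2)
                   for conv in (self.q, self.k, self.v))
        out, _ = self.attn(q, k, v)  # every pixel attends to every other pixel
        return out.transpose(1, 2).reshape(b, c, h, w)


class HybridSketch(nn.Module):
    """3D-CNN front end followed by one convolutional attention block."""

    def __init__(self, bands=30, num_classes=16, dim=64, heads=4):
        super().__init__()
        # 3D convolution mixes neighbouring bands and neighbouring pixels.
        self.conv3d = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3), padding=(0, 1, 1)),
            nn.BatchNorm3d(8),
            nn.ReLU(),
        )
        depth = bands - 6  # spectral length after the unpadded 7-band kernel
        # A 1x1 convolution folds (channels x spectral depth) into `dim` features.
        self.embed = nn.Conv2d(8 * depth, dim, kernel_size=1)
        self.attn = ConvAttention(dim, heads)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):  # x: (B, 1, bands, H, W)
        x = self.conv3d(x)        # (B, 8, bands - 6, H, W)
        x = x.flatten(1, 2)       # (B, 8 * (bands - 6), H, W)
        x = self.embed(x)         # (B, dim, H, W)
        x = x + self.attn(x)      # residual connection around attention
        x = x.mean(dim=(2, 3))    # global average pooling -> (B, dim)
        return self.head(self.norm(x))


if __name__ == "__main__":
    model = HybridSketch(bands=30, num_classes=16).eval()
    patches = torch.randn(2, 1, 30, 9, 9)  # two 9x9 patches with 30 bands each
    with torch.no_grad():
        logits = model(patches)
    print(logits.shape)
```

In this sketch each input is a small spatial patch centred on the pixel to be classified, with the spectral bands treated as the depth axis of the 3D convolution. The output holds one row of class scores per patch.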