A Lightweight 1-D Convolution Augmented Transformer with Metric Learning for Hyperspectral Image Classification

Xiang Hu, Wenjing Yang, Hao Wen, Yu Liu, Yuanxi Peng
DOI: https://doi.org/10.3390/s21051751
IF: 3.9
2021-03-03
Sensors
Abstract: Hyperspectral image (HSI) classification is the subject of intense research in remote sensing. The tremendous success of deep learning in computer vision has recently sparked the interest in applying deep learning in hyperspectral image classification. However, most deep learning methods for hyperspectral image classification are based on convolutional neural networks (CNN). Those methods require heavy GPU memory resources and run time. Recently, another deep learning model, the transformer, has been applied for image recognition, and the study result demonstrates the great potential of the transformer network for computer vision tasks. In this paper, we propose a model for hyperspectral image classification based on the transformer, which is widely used in natural language processing. Besides, we believe we are the first to combine the metric learning and the transformer model in hyperspectral image classification. Moreover, to improve the model classification performance when the available training samples are limited, we use the 1-D convolution and Mish activation function. The experimental results on three widely used hyperspectral image data sets demonstrate the proposed model’s advantages in accuracy, GPU memory cost, and running time.
Engineering, Electrical & Electronic; Chemistry, Analytical; Instruments & Instrumentation
What problem does this paper attempt to address?
The paper mainly addresses the following issues:

1. **Proposing a lightweight 1-D convolution-enhanced Transformer model**: For hyperspectral image classification, the paper proposes a lightweight network based on the Transformer. The model combines the representation capability of the Transformer with 1-D convolution operations, with the aim of reducing GPU memory cost and running time relative to CNN-based methods.
2. **Introducing a metric learning mechanism**: To make the learned features more discriminative, the paper integrates a metric learning mechanism (specifically, center loss) with the Transformer model. The authors believe they are the first to combine metric learning and the Transformer in hyperspectral image classification.
3. **Improving the Transformer model**: The paper replaces the traditional linear projection layers with 1-D convolution layers and adopts the Mish activation function, which is intended to improve classification performance when training samples are limited.

In summary, the paper aims to provide an efficient and accurate method for hyperspectral image classification, especially when training data are limited, by combining the Transformer, metric learning, and these network design choices.
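
The three ingredients named above (1-D convolution, the Mish activation, and center loss) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the kernel values, the unpadded convolution, and the unweighted `0.5 * sum` form of center loss are assumptions chosen for illustration, and the summary here does not specify how the paper configures or weights these pieces.

```python
import math


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x)), with softplus(x) = ln(1 + e^x)."""
    # Numerically stable softplus: max(x, 0) + ln(1 + e^{-|x|}).
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def conv1d(signal, kernel, stride=1):
    """Unpadded 1-D cross-correlation of one spectral vector with one kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


def center_loss(features, labels, centers):
    """Half the summed squared distance from each feature to its class center.

    `centers` maps a class label to its center vector (hypothetical layout).
    """
    total = 0.0
    for feat, label in zip(features, labels):
        center = centers[label]
        total += sum((f - c) ** 2 for f, c in zip(feat, center))
    return 0.5 * total


# Toy usage: project a 4-band "spectrum", apply Mish, then score two
# 2-D features against per-class centers.
projected = conv1d([1.0, 2.0, 3.0, 4.0], [1.0, 1.0])  # [3.0, 5.0, 7.0]
activated = [mish(v) for v in projected]
loss = center_loss(
    features=[[1.0, 0.0], [0.0, 2.0]],
    labels=[0, 1],
    centers={0: [0.0, 0.0], 1: [0.0, 0.0]},
)  # 0.5 * (1 + 4) = 2.5
```

In a training setup, a loss of this kind is typically added to the usual classification loss with a weighting factor, pulling features of the same class toward a shared center; the weighting used in the paper is not stated in this summary.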