Semi-supervised Co-training Model Using Convolution and Transformer for Hyperspectral Image Classification

Feng Zhao, Xiqun Song, Junjie Zhang, Hanqiang Liu
DOI: https://doi.org/10.1109/lgrs.2024.3409351
IF: 5.343
2024-01-01
IEEE Geoscience and Remote Sensing Letters
Abstract: Deep learning algorithms have shown significant advantages in hyperspectral image (HSI) classification. However, these algorithms usually require a large number of labeled samples, and annotating these samples consumes considerable time and resources. To achieve effective classification when few labeled samples are available, a semi-supervised co-training model using convolution and transformer (SCM-CT) is proposed in this letter. First, two different networks, namely a multi-scale parallel CNN (MPCNN) and a global and local transformer fusion network (GLTFN), are designed as co-training learners to extract multi-scale spectral-spatial features and combined global-local features from HSIs, respectively. Second, to ensure that the two learners generate reliable predictions and to make use of more unlabeled samples whose pseudo-labels have low confidence, a self-adaptive threshold and conflict pseudo-labeling (SATCP) strategy is proposed. It helps the model learn more valuable spectral-spatial information from conflicting predictions and improves both convergence speed and model performance. Finally, to prevent the learners from collapsing, a discrepancy loss is computed to reduce the similarity between the features extracted by the two learners, forcing them to learn different information from the same input. Experimental results on the University of Pavia, Salinas Valley, and Houston 2013 datasets show that SCM-CT achieves overall accuracies of 97.42%, 95.60%, and 95.20%, respectively, outperforming state-of-the-art methods.
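
The abstract does not give the formulas behind SATCP or the discrepancy loss, so the following is only a minimal NumPy sketch of the two general ideas it names: splitting unlabeled samples into confident agreements and conflicts between two learners, and penalizing similarity between the two learners' features. The function names, the use of cosine similarity, and the fixed `threshold` argument are assumptions made for illustration. In particular, the paper's self-adaptive threshold rule is not reproduced here.

```python
import numpy as np


def cosine_discrepancy(feat_a, feat_b, eps=1e-8):
    """Mean cosine similarity between paired feature rows.

    Assumption: cosine similarity stands in for the paper's (unspecified)
    similarity measure. Minimizing this value pushes the learners apart.
    """
    a = feat_a / (np.linalg.norm(feat_a, axis=1, keepdims=True) + eps)
    b = feat_b / (np.linalg.norm(feat_b, axis=1, keepdims=True) + eps)
    return float(np.mean(np.sum(a * b, axis=1)))


def split_pseudo_labels(prob_a, prob_b, threshold):
    """Partition unlabeled samples by learner agreement and confidence.

    prob_a, prob_b: (n_samples, n_classes) class probabilities from the
    two learners. threshold is a plain parameter here (hypothetical); the
    paper adapts it during training by a rule not stated in the abstract.
    """
    label_a = prob_a.argmax(axis=1)
    label_b = prob_b.argmax(axis=1)
    agree = label_a == label_b
    confident = (prob_a.max(axis=1) >= threshold) & (prob_b.max(axis=1) >= threshold)

    # Samples both learners label identically and confidently.
    agreed_idx = np.flatnonzero(agree & confident)
    # Samples on which the learners disagree (candidate "conflict" samples).
    conflict_idx = np.flatnonzero(~agree)
    return agreed_idx, label_a[agreed_idx], conflict_idx


if __name__ == "__main__":
    prob_a = np.array([[0.90, 0.10], [0.60, 0.40], [0.20, 0.80]])
    prob_b = np.array([[0.95, 0.05], [0.30, 0.70], [0.45, 0.55]])
    agreed, labels, conflicts = split_pseudo_labels(prob_a, prob_b, threshold=0.8)
    print("agreed:", agreed, "labels:", labels, "conflicts:", conflicts)
```

In a training loop, the agreed samples would typically be added to the supervised loss with their pseudo-labels, the conflict samples would be handled by a separate rule, and the discrepancy term would be added to the total loss with a weighting factor. All three choices are design decisions that the abstract leaves open.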