CNN-Based Multilayer Spatial–Spectral Feature Fusion and Sample Augmentation with Local and Nonlocal Constraints for Hyperspectral Image Classification

Jie Feng, Jiantong Chen, Liguo Liu, Xianghai Cao, Xiangrong Zhang, Licheng Jiao, Tao Yu
DOI: https://doi.org/10.1109/jstars.2019.2900705
IF: 4.715
2019-01-01
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Abstract: The extraction of joint spatial-spectral features has been shown to improve the classification performance of hyperspectral images (HSIs). Recently, using convolutional neural networks (CNNs) to learn joint spatial-spectral features has attracted great interest. However, existing CNN models ignore the complementary spatial-spectral information among the shallow and deep layers. Moreover, the insufficient number of training samples in HSIs makes these CNN models prone to overfitting. To address these problems, a novel CNN method for HSI classification is proposed. It combines multilayer spatial-spectral feature fusion with sample augmentation under local and nonlocal constraints, and is abbreviated as MSLN-CNN. In MSLN-CNN, a triple-architecture CNN is constructed to extract spatial-spectral features by cascading spectral features to dual-scale spatial features from shallow to deep layers. Then, multilayer spatial-spectral features are fused to learn complementary information among the shallow layers, which carry detailed information, and the deep layers, which carry semantic information. Finally, the multilayer spatial-spectral feature fusion and classification are integrated into a unified network, so that MSLN-CNN can be optimized in an end-to-end manner. To alleviate the small sample size problem, unlabeled samples with high confidence under both the local spatial constraint and the nonlocal spectral constraint are selected and prelabeled. The nonlocal spectral constraint considers the structure information of spectrally similar samples found in a nonlocal search, while the local spatial constraint uses the contextual information of spatially adjacent samples. Experimental results on several hyperspectral datasets demonstrate that the proposed method achieves better classification performance than current state-of-the-art classification methods, especially with limited training samples.
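The abstract does not give the exact selection rule, so the following is only a minimal sketch of how prelabeling under a local spatial constraint and a nonlocal spectral constraint could look. It assumes, purely for illustration, that an unlabeled pixel is prelabeled only when the classifier's prediction agrees with the majority label of the labeled pixels in its spatial window (local) and with the majority label of its k spectrally nearest labeled pixels (nonlocal). The function names, the Euclidean distance, and the parameters `radius`, `k`, and `min_agree` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def majority(labels):
    """Return (label, fraction) of the most common label, or (None, 0.0) if empty."""
    if not labels:
        return None, 0.0
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def select_prelabels(spectra, labels, predicted, radius=1, k=3, min_agree=0.5):
    """Pick unlabeled pixels whose predicted class passes both constraints.

    spectra:   {(row, col): [band values]} for every pixel used
    labels:    {(row, col): class} for labeled training pixels
    predicted: {(row, col): class} classifier output for unlabeled pixels
    All thresholds here are illustrative assumptions, not the paper's values.
    """
    selected = {}
    for (r, c), pred in predicted.items():
        # Local spatial constraint: labeled pixels inside the square window.
        local = [
            labels[(rr, cc)]
            for rr in range(r - radius, r + radius + 1)
            for cc in range(c - radius, c + radius + 1)
            if (rr, cc) != (r, c) and (rr, cc) in labels
        ]
        local_label, local_frac = majority(local)

        # Nonlocal spectral constraint: k labeled pixels with the closest
        # spectra anywhere in the image (Euclidean distance is assumed).
        nearest = sorted(
            labels, key=lambda p: math.dist(spectra[p], spectra[(r, c)])
        )[:k]
        nonlocal_label, nonlocal_frac = majority([labels[p] for p in nearest])

        # Keep the pixel only if both constraints support the prediction.
        if (
            local_label == pred == nonlocal_label
            and local_frac > min_agree
            and nonlocal_frac > min_agree
        ):
            selected[(r, c)] = pred
    return selected
```

In this sketch, a pixel with no labeled spatial neighbors is never prelabeled, which is a conservative design choice: it trades fewer augmented samples for a lower risk of adding wrong labels to an already small training set.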