Abstract: Convolutional neural networks (CNNs) have attracted tremendous attention in the remote sensing community due to their excellent performance in different domains. For remote sensing scene classification in particular, CNN-based methods have brought a major breakthrough. However, it is not feasible to design and train a new CNN model from scratch for remote sensing scene classification, as this usually requires a large number of training samples and high computational cost. To alleviate these limitations, some works use pretrained CNN models as feature extractors to build feature representations of scene images for classification, and have achieved impressive results. In this scheme, how to construct the feature representation of a scene image from the pretrained CNN model becomes the key step. Existing studies have paid little attention to building more discriminative feature representations by exploring the potential benefits of multilayer features from a single CNN model and of different feature representations from multiple CNN models. To this end, this paper presents a fusion strategy that builds the feature representation of scene images by integrating multilayer features of a single pretrained CNN model, and extends it to a framework of multiple CNN models. For these purposes, a multiscale improved Fisher kernel coding method is used to build the feature representation of scene images on convolutional layers, and a feature fusion approach based on two feature subspace learning methods [principal component analysis (PCA)/spectral regression kernel discriminant analysis and PCA/spectral regression kernel locality preserving projection] is proposed to construct the final fused features for scene classification.
For validation and comparison purposes, the proposed approaches are evaluated on two challenging high-resolution remote sensing datasets and show competitive performance compared with existing state-of-the-art baselines such as fully trained CNN models, fine-tuned CNN models, and other related works.
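As a rough illustration of the fusion idea described in the abstract, the following minimal sketch reduces two per-image feature sets with PCA and concatenates the results. It is not the paper's implementation: the feature matrices are random stand-ins for features that would come from different layers of a pretrained CNN, the Fisher kernel coding step and the spectral regression stages (kernel discriminant analysis and kernel locality preserving projection) are omitted, and all function names and dimensions are hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return (mean, components) for a PCA of the rows of X."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project centered rows of X onto the principal directions."""
    return (X - mean) @ components.T

def l2_normalize(X, eps=1e-12):
    """Scale each row to (approximately) unit Euclidean norm."""
    return X / (np.linalg.norm(X, axis=1, keepdims=True) + eps)

def fuse_features(feature_sets, n_components):
    """L2-normalize each feature set, reduce it with PCA, then concatenate."""
    reduced = []
    for X in feature_sets:
        Xn = l2_normalize(X)
        mean, comps = pca_fit(Xn, n_components)
        reduced.append(pca_transform(Xn, mean, comps))
    return np.hstack(reduced)

# Random stand-ins (assumption): 50 images, two feature sets of different size.
rng = np.random.default_rng(0)
layer_a_feats = rng.normal(size=(50, 256))
layer_b_feats = rng.normal(size=(50, 128))

fused = fuse_features([layer_a_feats, layer_b_feats], n_components=16)
print(fused.shape)
```

In a setup closer to the one the abstract describes, the fused matrix would be passed through a discriminative subspace method before classification; here the plain PCA projection only shows where that stage would attach.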

An End-to-End Local-Global-Fusion Feature Extraction Network for Remote Sensing Image Scene Classification

LPNet: A Remote Sensing Scene Classification Method Based on Large Kernel Convolution and Parameter Fusion

A lightweight and stochastic depth residual attention network for remote sensing scene classification

Dynamic Convolution Covariance Network Using Multi-Scale Feature Fusion for Remote Sensing Scene Image Classification

Multilayer Feature Fusion Network With Spatial Attention and Gated Mechanism for Remote Sensing Scene Classification

MGFN: A Multi-Granularity Fusion Convolutional Neural Network for Remote Sensing Scene Classification

RSCNN: A CNN-Based Method to Enhance Low-Light Remote-Sensing Images

Local-global feature fusion network for hyperspectral image classification

Progressive Feature Fusion Framework Based on Graph Convolutional Network for Remote Sensing Scene Classification

A Deep Neural Network Combined With Context Features for Remote Sensing Scene Classification

Remote Sensing Scene Classification by Local–Global Mutual Learning

Remote Sensing Scene Classification by Gated Bidirectional Network

Aerial Scene Classification Via Multilevel Fusion Based on Deep Convolutional Neural Networks

A Combination of Lie Group Machine Learning and Deep Learning for Remote Sensing Scene Classification Using Multi-Layer Heterogeneous Feature Extraction and Fusion

Remote Sensing Scene Classification Based on Multi-Structure Deep Features Fusion

A Remote-Sensing Scene-Image Classification Method Based on Deep Multiple-Instance Learning with a Residual Dense Attention ConvNet

MGML: Multigranularity Multilevel Feature Ensemble Network for Remote Sensing Scene Classification

Feature and Model Level Fusion of Pretrained CNN for Remote Sensing Scene Classification

A Cross-Layer Nonlocal Network for Remote Sensing Scene Classification

Multi-Scale and Multi-Network Deep Feature Fusion for Discriminative Scene Classification of High-Resolution Remote Sensing Images

Few-shot remote sensing scene classification based on multi subband deep feature fusion