Dynamic Convolution Covariance Network Using Multi-Scale Feature Fusion for Remote Sensing Scene Image Classification

Xinyu Wang, Furong Shi, Haixia Xu, Liming Yuan, Xianbin Wen
DOI: https://doi.org/10.1109/jstars.2024.3456854
2024-01-01
Abstract: The rapid increase in the spatial resolution of remote sensing scene images (RSIs) has led to a corresponding increase in the complexity of the spatial contextual information they contain. The coexistence of numerous smaller features makes it challenging to locate and mine these features accurately, which in turn makes accurate interpretation difficult. To address these issues, this article proposes a dynamic convolution covariance network (ODFMN) based on omni-dimensional dynamic convolution, which can extract multidimensional and multiscale features from RSIs and build a higher-order statistical representation of the feature information. First, to fully exploit the complex spatial context information of RSIs while overcoming the limitation of a single static convolution kernel for feature extraction, we construct an omni-dimensional feature extraction module based on dynamic convolution, which fully extracts the 4-D information within the convolution kernel. Then, to make full use of the full-dimensional feature information extracted at each level of the network, the feature representation is enriched by constructing a multiscale feature fusion module that establishes relationships from local to global. Finally, higher-order statistical information is employed because first-order information alone has difficulty representing smaller object features. Experiments conducted on publicly available datasets demonstrate that the method achieves classification accuracies of 99.04%, 95.34%, and 92.50%, respectively. Furthermore, feature visualization verifies that the method accurately captures feature target contours, shapes, and spatial context information.
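The contrast between first-order and higher-order statistics mentioned in the abstract can be illustrated with a minimal sketch of covariance (second-order) pooling in plain Python. This is not the authors' implementation: the function name `covariance_pooling`, the list-of-lists layout, and the toy inputs are assumptions made purely for illustration.

```python
def covariance_pooling(features):
    """Return (mean, cov) for a feature map.

    `features` is a list of N spatial positions, each a list of C channel
    responses (a hypothetical layout chosen for this sketch).
    `mean` is the per-channel average (first-order statistic);
    `cov` is the C x C channel covariance matrix (second-order statistic).
    """
    n = len(features)
    c = len(features[0])
    # First-order statistic: what global average pooling would keep.
    mean = [sum(pos[k] for pos in features) / n for k in range(c)]
    # Second-order statistic: how pairs of channels co-vary across positions.
    cov = [[0.0] * c for _ in range(c)]
    for pos in features:
        centered = [pos[k] - mean[k] for k in range(c)]
        for i in range(c):
            for j in range(c):
                cov[i][j] += centered[i] * centered[j] / n
    return mean, cov


# Two toy feature maps with identical channel means but different structure.
correlated = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
anticorrelated = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

mean_a, cov_a = covariance_pooling(correlated)
mean_b, cov_b = covariance_pooling(anticorrelated)
```

In this toy case the two feature maps have the same channel means, so average pooling cannot tell them apart, while the off-diagonal covariance entries differ in sign. That is the general motivation for second-order representations; how the paper's network computes and normalizes its covariance features is not specified in the abstract.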