Scene Classification Using Multi-Scale Deeply Described Visual Words

Wenzhi Zhao, Shihong Du
DOI: https://doi.org/10.1080/01431161.2016.1207266
IF: 3.531
2016-01-01
International Journal of Remote Sensing
Abstract: This article presents a deep learning-based Multi-scale Bag-of-Visual Words (MBVW) representation for scene classification of high-resolution aerial imagery. Specifically, a convolutional neural network (CNN) is introduced to learn and characterize the complex local spatial patterns at different scales. The learnt deep features are then exploited in a novel way to generate visual words. The MBVW representation is constructed from the statistics of visual word co-occurrences at different scales, which are derived from a training data set. We apply our technique to a challenging aerial scene data set, the University of California (UC) Merced data set, which consists of 21 different aerial scene categories with sub-metre resolution. The experimental results show that the statistics of deeply described visual words characterize the scenes well and improve classification accuracy. This demonstrates that the proposed method is highly effective for scene classification of high-resolution remote-sensing imagery.
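The general idea of a multi-scale bag-of-visual-words feature can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes that local descriptors have already been extracted at each scale (in the paper these come from a CNN) and that a codebook of visual words per scale has already been learnt from training data. It also substitutes plain per-scale word histograms for the paper's co-occurrence statistics, whose exact form the abstract does not specify. The names `nearest_word`, `bovw_histogram`, and `multiscale_bovw` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to the descriptor
    (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(codebook):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bovw_histogram(descriptors: Sequence[Vector],
                   codebook: Sequence[Vector]) -> List[float]:
    """Count how often each visual word occurs, normalised to sum to 1."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in counts]


def multiscale_bovw(descriptors_by_scale: Sequence[Sequence[Vector]],
                    codebooks_by_scale: Sequence[Sequence[Vector]]) -> List[float]:
    """Concatenate the per-scale histograms into one scene-level feature.

    Assumption: plain histograms stand in for the co-occurrence
    statistics mentioned in the abstract.
    """
    feature: List[float] = []
    for descriptors, codebook in zip(descriptors_by_scale, codebooks_by_scale):
        feature.extend(bovw_histogram(descriptors, codebook))
    return feature


# Toy example with made-up 2-D descriptors at two scales.
codebook_small = [(0.0, 0.0), (1.0, 1.0)]
codebook_large = [(0.0, 1.0), (1.0, 0.0)]
descs_small = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (0.8, 0.9)]
descs_large = [(0.1, 0.9), (0.9, 0.2)]

scene_feature = multiscale_bovw([descs_small, descs_large],
                                [codebook_small, codebook_large])
print(scene_feature)  # [0.25, 0.75, 0.5, 0.5]
```

The resulting vector could then be passed to any standard classifier. In practice the descriptors would be high-dimensional CNN activations and the codebooks would be obtained by clustering training descriptors.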