Abstract: In recent decades, remote sensing object counting has attracted increasing attention from academia and industry because of its potential benefits for urban traffic, public safety, and road planning. However, the task remains challenging for computer vision because of technical barriers such as large variations in object scale, complex background interference, and nonuniform density distribution. Recent results show that convolutional neural networks (CNNs) are promising for object counting, but most existing CNN-based methods rely on larger and more complex architectures, which impose heavy computational and storage burdens and severely limit their application in real-world scenarios. In this article, a lightweight multiscale feature fusion network for remote sensing object counting, named LMSFFNet, is presented to achieve a better balance between running speed and counting accuracy. Specifically, in the encoding process, we select a MobileViT module as the backbone of the network to reduce the number of network parameters and the computational cost. To compensate for the weaker feature extraction ability of the lightweight network, a cascade structure of channel–spatial attention mechanisms is added. In the decoding process, a lightweight multiscale context fusion module (LMCFM) is developed as the multiscale feature fusion module; it addresses the problem that, when multiscale features are extracted, the number of parameters grows as the object scale to be covered expands. In addition, a lightweight counting scale pooling module (LCSPM) is used to mine the subtle features of the target object. Two kinds of typical object counting experiments, namely, experiments on a remote sensing benchmark (the RSOC dataset) and on crowd benchmarks (the ShanghaiTech, UCF-QNRF, and UCF_CC_50 datasets), show the effectiveness of the proposed method.
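The parameter-growth problem that the abstract attributes to multiscale feature extraction can be made concrete with a little arithmetic. The sketch below is not the LMCFM design, which the abstract does not specify. It only compares three generic ways of covering larger receptive fields, assuming a hypothetical layer with 64 input and 64 output channels and no bias terms: enlarging a standard convolution kernel, dilating a 3x3 kernel, and using a depthwise separable 3x3 convolution. The function names are illustrative.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def dilated_receptive_field(k, dilation):
    """Side length of the receptive field of one k x k conv with the given dilation."""
    return dilation * (k - 1) + 1


if __name__ == "__main__":
    channels = 64  # assumed channel width, chosen only for illustration
    for dilation in (1, 2, 3):
        rf = dilated_receptive_field(3, dilation)
        # A standard kernel must grow to rf x rf to see the same area,
        # while the dilated 3 x 3 kernel keeps its nine weights per channel pair.
        print(
            f"receptive field {rf}x{rf}: "
            f"standard {conv_params(channels, channels, rf)} weights, "
            f"dilated 3x3 {conv_params(channels, channels, 3)} weights, "
            f"depthwise separable 3x3 "
            f"{depthwise_separable_params(channels, channels, 3)} weights"
        )
```

Under these assumptions, the standard kernel's weight count grows quadratically with the receptive field, while the dilated and depthwise separable variants stay constant. That is the kind of saving a lightweight multiscale module aims for, whatever mechanism it actually uses.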

Weight-sharing Multi-Stage Multi-Scale Ensemble Convolutional Neural Network

Remote Sensing Scene Image Classification Model Based on Multi-Scale Features and Attention Mechanism

Multi-scale Matching Networks for Semantic Correspondence

Multi-scale attention network for image super-resolution

Multi-scale Unified Network for Image Classification

Enhanced Multi-Scale Feature Adaptive Fusion Sparse Convolutional Network for Large-Scale Scenes Semantic Segmentation

Multi-scale Convolution Aggregation and Stochastic Feature Reuse for DenseNets

A Lightweight Multi-Scale Channel Attention Network for Image Super-Resolution

Multi-scale feature aggregation network for Image super-resolution

Lightweight multi-scale distillation attention network for image super-resolution

Consecutive multiscale feature learning-based image classification model

Multi-scale strip-shaped convolution attention network for lightweight image super-resolution

Multi-scale skip-connection network for image super-resolution

Multiscale Conditional Regularization for Convolutional Neural Networks

A channel-wise multi-scale network for single image super-resolution

Lightweight single-image super-resolution via multi-scale feature fusion CNN and multiple attention block

A Lightweight Multiscale Feature Fusion Network for Remote Sensing Object Counting

Coarse-to-fine Trained Multi-Scale Convolutional Neural Networks for Image Classification

SCWC: Structured channel weight sharing to compress convolutional neural networks

Wide Weighted Attention Multi-Scale Network for Accurate MR Image Super-Resolution

Multi-scale features fused network with multi-level supervised path for crowd counting