Abstract: Targets in remote sensing images are small and densely distributed. To address this, this paper adds a detection head, P2, dedicated to small-scale targets on top of the three detection layers of the original YOLOv5 model, and feeds the shallow high-resolution feature map into the subsequent multi-scale feature fusion module. This helps avoid losing the key feature information of small-scale targets during repeated downsampling. First, an enhanced multi-scale feature fusion pyramid network, DSI-FPN, is designed. The FPN+PAN network is optimized with depthwise separable convolution and Involution operators, which require fewer parameters and less computation, together with a spatial attention mechanism, to generate more informative feature maps for the detection task. Second, we propose an adaptive channel-spatial attention mechanism, SCBAM, which introduces self-attention into the CBAM module. This adds non-local information to interactions that originally used only local information, overcomes the limit imposed by the convolution kernel, enlarges the model's receptive field, and improves its feature representation ability. Third, to address insufficient computing power when the detector is deployed on devices, we propose a feature-layer-based knowledge distillation framework with joint teachers. A teacher distillation loss is designed, and the student's online learning is adjusted dynamically by balancing the contributions of the teacher network and the ground truth. The detection accuracy of the student network improves markedly, while the network's parameter count and model size are reduced. Finally, comparison with other remote sensing object detection methods shows that the proposed approach detects small-scale targets in remote sensing images better under different lighting conditions. The detection accuracy reaches 43.9%, which is 7.4% higher than that of the original model. After knowledge distillation, the model parameters are reduced to 1/3 of the original, and the detection accuracy is 40.2%.
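The abstract credits depthwise separable convolution with needing fewer parameters than standard convolution. The sketch below illustrates that general property by counting weights for a single layer. It is not the paper's DSI-FPN: the function names and the layer sizes (256 input channels, 256 output channels, 3x3 kernel) are hypothetical values chosen only for illustration, and bias terms are omitted.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    c_in, c_out, k = 256, 256, 3
    standard = conv_params(c_in, c_out, k)
    separable = depthwise_separable_params(c_in, c_out, k)
    print(f"standard conv weights:  {standard}")
    print(f"separable conv weights: {separable}")
    print(f"reduction factor:       {standard / separable:.2f}x")
```

For these assumed sizes the standard layer has 589,824 weights and the separable one 67,840, which is the kind of saving that makes such operators attractive in a feature fusion neck.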

Multi-Scale Object Detection Using Feature Fusion Recalibration Network

Scale-Balanced Real-Time Object Detection with Varying Input-Image Resolution

Small Object Detection using Multi-scale Feature Fusion and Attention

Multi-level Feature Fusion Pyramid Network for Object Detection

A multi-scale pyramid feature fusion-based object detection method for remote sensing images

Pyramid attention object detection network with multi-scale feature fusion

Improving Object Detection in YOLOv8n with the C2f-f Module and Multi-Scale Fusion Reconstruction

Object Detection in Remote Sensing Images Based on Adaptive Multi-Scale Feature Fusion Method

Object Detection of Remote Sensing Image Based on Multi-Scale Feature Fusion and Attention Mechanism

Multi-scale feature selection and fusion for object detection

Exploring Multi-scale Deep Feature Fusion for Object Detection

Object Detection Based on Feature Scale Fusion and Feature Scale Enhancement

Object detection model of coal mine rescue robot based on multi-scale feature fusion

Multi-scale detector optimized for small target

Multiscale Feature Adaptive Fusion for Object Detection in Optical Remote Sensing Images

Multi-scale Fusion with Context-aware Network for Object Detection

An improved YOLOv5 method for large objects detection with multi-scale feature cross-layer fusion network

Multi-Scale Reinforcement Learning Strategy for Object Detection

Object Detection Based on Improved YOLOv3-tiny

Multi-RPN Fusion-based Sparse PCA-CNN Approach to Object Detection and Recognition for Robot-aided Visual System

Scale-Insensitive Object Detection Via Attention Feature Pyramid Transformer Network