Poly Kernel Inception Network for Remote Sensing Detection

Xinhao Cai, Qiuxia Lai, Yuwei Wang, Wenguan Wang, Zeren Sun, Yazhou Yao

2024-03-20

Abstract: Object detection in remote sensing images (RSIs) often suffers from several increasing challenges, including the large variation in object scales and the diverse-ranging context. Prior methods tried to address these challenges by expanding the spatial receptive field of the backbone, either through large-kernel convolution or dilated convolution. However, the former typically introduces considerable background noise, while the latter risks generating overly sparse feature representations. In this paper, we introduce the Poly Kernel Inception Network (PKINet) to handle the above challenges. PKINet employs multi-scale convolution kernels without dilation to extract object features of varying scales and capture local context. In addition, a Context Anchor Attention (CAA) module is introduced in parallel to capture long-range contextual information. These two components work jointly to advance the performance of PKINet on four challenging remote sensing detection benchmarks, namely DOTA-v1.0, DOTA-v1.5, HRSC2016, and DIOR-R.
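The sparsity concern with dilated convolution can be made concrete with a little arithmetic. The sketch below is not taken from the paper; the helper names `dilated_extent` and `sampling_density` and the kernel/dilation pairs are illustrative assumptions. It uses the standard relation that a k x k kernel with dilation d spans a window of side d(k - 1) + 1 while still reading only k x k positions inside it.

```python
# Illustrative sketch (not from the paper): how sparsely a dilated kernel
# samples the window it spans, compared with a dense kernel of the same extent.

def dilated_extent(kernel_size: int, dilation: int) -> int:
    """Side length of the window spanned by a square kernel with dilation."""
    return dilation * (kernel_size - 1) + 1


def sampling_density(kernel_size: int, dilation: int) -> float:
    """Fraction of positions inside the spanned window that the kernel reads."""
    extent = dilated_extent(kernel_size, dilation)
    return (kernel_size ** 2) / (extent ** 2)


if __name__ == "__main__":
    # (kernel size, dilation) pairs chosen only for illustration.
    for k, d in [(3, 1), (3, 2), (3, 4), (9, 1)]:
        print(
            f"k={k} d={d} extent={dilated_extent(k, d)} "
            f"density={sampling_density(k, d):.3f}"
        )
```

Under these assumptions, a 3 x 3 kernel with dilation 4 covers the same 9 x 9 window as a dense 9 x 9 kernel but reads only one ninth of its positions, which is the kind of sparsity the abstract refers to.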

Computer Science

What problem does this paper attempt to address?

The paper mainly addresses the problem of object detection in remote sensing images, particularly how to effectively handle the significant variations in object scales and diverse contextual information. To solve these issues, the authors propose a multi-scale convolutional network called the Poly Kernel Inception Network (PKINet). The core contributions of PKINet are:

1. **Multi-scale texture feature extraction**: By employing depthwise separable convolution kernels of different sizes (without dilation), PKINet can extract multi-scale texture features at different receptive fields and adaptively fuse these features through a channel fusion mechanism to capture local contextual information.
2. **Long-range context capture**: The introduction of the Context Anchor Attention (CAA) module captures long-range contextual information, further enhancing the feature representation of the central region.
3. **Lightweight design**: Utilizing depthwise separable convolutions and 1D convolutions, the model has fewer parameters and high computational efficiency.

Through the above methods, PKINet demonstrates significant performance improvements on four challenging remote sensing detection benchmark datasets (DOTA-v1.0, DOTA-v1.5, HRSC2016, and DIOR-R). Experimental results show that PKINet not only effectively handles variations in object scales but also fully leverages the contextual information around objects, thereby improving the accuracy of object detection in remote sensing images.
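The parameter savings behind the lightweight-design point can be checked with a rough count. This is a minimal sketch, not the authors' implementation: the channel width of 64, the kernel sizes, the use of depthwise strips for the 1D convolutions, and all function names are assumptions chosen for illustration, and bias terms are ignored.

```python
# Rough weight counts (biases ignored). All sizes are illustrative
# assumptions, not values stated in the summary above.

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a dense k x k convolution."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def square_depthwise_params(channels: int, k: int) -> int:
    """Weights in a single k x k depthwise convolution."""
    return k * k * channels


def strip_pair_params(channels: int, k: int) -> int:
    """Weights in a 1 x k depthwise conv followed by a k x 1 depthwise conv."""
    return 2 * k * channels


if __name__ == "__main__":
    channels = 64                      # assumed channel width
    kernel_sizes = (3, 5, 7, 9, 11)    # assumed multi-scale kernel set
    for k in kernel_sizes:
        print(
            f"k={k}: standard={standard_conv_params(channels, channels, k)} "
            f"separable={depthwise_separable_params(channels, channels, k)}"
        )
    # Two 1D strips cover a k x k window with 2k weights per channel
    # instead of k * k.
    print("square 11x11:", square_depthwise_params(channels, 11))
    print("strip pair 11:", strip_pair_params(channels, 11))
```

The gap widens with kernel size: the dense count grows with k * k * c_in * c_out, while the separable count adds only k * k * c_in on top of a fixed pointwise cost, and the strip pair grows linearly in k.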

Dynamic Convolution Covariance Network Using Multi-Scale Feature Fusion for Remote Sensing Scene Image Classification

Large Selective Kernel Network for Remote Sensing Object Detection

Remote Sensing Object Detection in Disturbed Environment Based on Improved RTMDet Network

RADet: Refine Feature Pyramid Network and Multi-Layer Attention Network for Arbitrary-Oriented Object Detection of Remote Sensing Images

Hierarchical Kernel Interaction Network for Remote Sensing Object Counting

A Self-Supplementary and Revised Network for Remote Sensing Object Detection

SDSDet: A real-time object detector for small, dense, multi-scale remote sensing objects

Info-FPN: An Informative Feature Pyramid Network for object detection in remote sensing images

Multi-scale Dense Object Detection in Remote Sensing Imagery Based on Keypoints

Adaptive adjacent context negotiation network for object detection in remote sensing imagery

An Effective and Lightweight Hybrid Network for Object Detection in Remote Sensing Images

A small object detection network for remote sensing based on CS-PANet and DSAN

Object Detection in Remote Sensing Imagery Based on Prototype Learning Network With Proposal Relation

A Two-Way Dense Feature Pyramid Networks for Object Detection of Remote Sensing Images

Deep Adaptive Proposal Network for Object Detection in Optical Remote Sensing Images

TCANet: Triple Context-Aware Network for Weakly Supervised Object Detection in Remote Sensing Images

A Parameter-Free Pixel Correlation-Based Attention Module for Remote Sensing Object Detection

Deep Hash Assisted Network for Object Detection in Remote Sensing Images

PCViT: A Pyramid Convolutional Vision Transformer Detector for Object Detection in Remote-Sensing Imagery

Cross-Layer Attention Network for Small Object Detection in Remote Sensing Imagery