Abstract: Object detection is an essential task in computer vision. Recently, several convolutional neural network (CNN)-based detectors have achieved great success in natural scenes. However, optical remote sensing images present considerable challenges because of their large scale of view, lower proportion of foreground target pixels, and drastic differences in object scale. To address these problems, we propose a novel one-stage detector called the full-scale object detection network (FSoD-Net), which consists of the proposed multiscale enhancement network (MSE-Net) backbone cascaded with scale-invariant regression layers (SIRLs). First, MSE-Net provides multiscale description enhancement by integrating the Laplace kernel with fewer parallel multiscale convolution layers. Second, the SIRLs contain three isolated regression branch layers (corresponding to small, medium, and large scales), which make the default discrete-scale bounding boxes (bboxes) cover full-scale object information in the regression procedure. A novel specific scale joint loss is also designed that uses the softmax function combined with a strong $L_1$-norm constraint in each regression branch layer. It further speeds up convergence and improves the classification scores of predicted bboxes.
Finally, extensive experiments are carried out on two challenging data sets, the large-scale dataset for object detection in aerial images (DOTA) and object detection in optical remote sensing images (DIOR), which contain multiple instances from different imaging platforms. The results demonstrate that FSoD-Net achieves better performance than other state-of-the-art one-stage detectors, reaching a mean average precision (mAP) of 75.33% on DOTA and 71.80% on DIOR. In particular, the average precision (AP) of tiny object detection improves by approximately 10%–20%.
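
The abstract gives no formula for the specific scale joint loss, so the following is only a rough sketch of the general idea it names: a softmax-based classification term combined with an $L_1$-norm term, computed in each of three scale branches and then summed. The function names, the box parameterization, the `l1_weight` balancing factor, and the plain summation across branches are all assumptions made for illustration, not the authors' formulation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def branch_loss(logits, label, pred_box, target_box, l1_weight=1.0):
    """Softmax cross-entropy plus a weighted L1 box term for one branch.

    l1_weight is a hypothetical balancing factor, not a value from the paper.
    """
    cls_loss = -math.log(softmax(logits)[label])
    box_loss = sum(abs(p - t) for p, t in zip(pred_box, target_box))
    return cls_loss + l1_weight * box_loss

def joint_loss(branches, l1_weight=1.0):
    """Sum per-branch losses; each branch is (logits, label, pred_box, target_box).

    Summing the small, medium, and large branches is an assumed combination rule.
    """
    return sum(branch_loss(*b, l1_weight=l1_weight) for b in branches)

# Illustrative inputs only: one prediction per scale branch.
small = ([2.0, 0.5], 0, [0.1, 0.1, 0.9, 0.9], [0.0, 0.0, 1.0, 1.0])
medium = ([0.3, 1.7], 1, [0.2, 0.2, 0.8, 0.8], [0.2, 0.2, 0.8, 0.8])
large = ([1.0, 1.0], 0, [0.0, 0.0, 1.0, 1.0], [0.0, 0.1, 1.0, 1.0])
total = joint_loss([small, medium, large], l1_weight=1.0)
```

Keeping the three branches as separate terms mirrors the abstract's description of isolated regression branches, each responsible for its own scale range.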

Multiscale Semantic Fusion-Guided Fractal Convolutional Object Detection Network for Optical Remote Sensing Imagery

SFSANet: Multiscale Object Detection in Remote Sensing Image Based on Semantic Fusion and Scale Adaptability

ℱ3-Net: Feature Fusion and Filtration Network for Object Detection in Optical Remote Sensing Images

An Adaptive Attention Fusion Mechanism Convolutional Network for Object Detection in Remote Sensing Images

Semantic Context-Aware Network for Multiscale Object Detection in Remote Sensing Images

Cross-Scale Feature Fusion for Object Detection in Optical Remote Sensing Images

Multiscale Feature Adaptive Fusion for Object Detection in Optical Remote Sensing Images

Adaptively Attentional Feature Fusion Oriented to Multiscale Object Detection in Remote Sensing Images

FSoD-Net: Full-Scale Object Detection From Optical Remote Sensing Imagery

A Task-Balanced Multiscale Adaptive Fusion Network for Object Detection in Remote Sensing Images

Multiscale Deformable Attention and Multilevel Features Aggregation for Remote Sensing Object Detection

Semantic Information Feature Aggregation Network for Object Detection in Remote Sensing Images

MFCANet: Multiscale Feature Context Aggregation Network for Oriented Object Detection in Remote-Sensing Images

MFIL-FCOS: A Multi-Scale Fusion and Interactive Learning Method for 2D Object Detection and Remote Sensing Image Detection

Channel Self-Attention Based Multiscale Spatial-Frequency Domain Network for Oriented Object Detection in Remote Sensing Imagery

A Multi-Feature Fusion and Attention Network for Multi-Scale Object Detection in Remote Sensing Images

Improving Multiscale Object Detection With Off-Centered Semantics Refinement

Multi-Scale Adaptive Feature Fusion Network For Semantic Segmentation In Remote Sensing Images

Shallow Multiplexing and Multiscale Dilation Convolution Combined Attention Based Oriented Object Detection in Remote Sensing Images

A Novel Multi-Model Decision Fusion Network For Object Detection In Remote Sensing Images