Abstract: In recent years, object detection has become one of the most prominent areas of computer vision. State-of-the-art object detectors combine convolutional neural network (CNN) techniques with other deep neural network techniques to improve detection performance and accuracy. Most recent object detectors employ feature pyramid networks (FPNs) and their variants, while others use combinations of attention mechanisms to achieve better performance. An open problem is the inconsistency between lower-layer features (their resolution, receptive field, and semantic information) and upper-layer features when detecting objects. Although some researchers have attempted to address this issue, we build on ideas from that work and propose an architecture called the dense attention feature pyramid network (DAF-Net) for multiscale object detection. DAF-Net consists of two attention models: a spatial attention model and a channel attention model. Unlike other attention models, ours are lightweight and fully data-driven, and we implement a densely connected attention FPN to reduce the model's complexity and avoid learning redundant feature maps. We first developed the two attention models, then used only the spatial attention model in the backbone of our network, and finally used both attention models to filter and maintain a steady flow of semantic information from the lower layers, improving the model's accuracy and efficiency. Experimental results on underwater images from the National Natural Science Foundation of China (NSFC) (Underwater Image Dataset, National Natural Science Foundation of China (NSFC), online, retrieved from http://www.cnurpc.org/index.html), the MS COCO dataset, and the PASCAL VOC dataset indicate higher accuracy and better detection results for the proposed model than for the benchmark model YOLOX-Darknet53 (Ge et al., YOLOX: Exceeding YOLO Series in 2021, arXiv preprint arXiv:2107.08430). Our model achieved 70.2 mAP, 48.9 mAP, and 83.9 mAP on the NSFC, MS COCO, and PASCAL VOC datasets, respectively, compared with the benchmark model's 68.9 mAP on NSFC, 47.7 mAP on MS COCO, and 82.4 mAP on PASCAL VOC.
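
To make the reported comparison easier to read, the short sketch below computes the absolute mAP gain of the proposed model over the YOLOX-Darknet53 baseline on each dataset. The numbers are taken directly from the abstract; the names `reported_map` and `map_gains` are hypothetical helpers introduced only for this illustration and are not part of the authors' code.

```python
# Reported mAP values copied from the abstract, stored as
# (proposed model, YOLOX-Darknet53 baseline). The dictionary and
# function names are illustrative assumptions, not the authors' code.
reported_map = {
    "NSFC underwater": (70.2, 68.9),
    "MS COCO": (48.9, 47.7),
    "PASCAL VOC": (83.9, 82.4),
}


def map_gains(results):
    """Return the absolute mAP gain (in points) over the baseline per dataset."""
    # Round to one decimal place to match the precision reported in the abstract.
    return {name: round(ours - base, 1) for name, (ours, base) in results.items()}


gains = map_gains(reported_map)
for name, gain in gains.items():
    print(f"{name}: +{gain} mAP")
```

Under these figures, the gains come to 1.3 points on the NSFC underwater data, 1.2 points on MS COCO, and 1.5 points on PASCAL VOC.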

ADOSMNet: a Novel Visual Affordance Detection Network with Object Shape Mask Guided Feature Encoders

Object affordance detection with relationship-aware network

High-level Object Affordance Recognition

One-Shot Object Affordance Detection in the Wild

EBiDA-FPN: Enhanced Bi-Directional Attention Feature Pyramid Network for Object Detection

Feature Fusion One-Stage Visual Affordance Detector

3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding

Object affordance detection with boundary-preserving network for robotic manipulation tasks

CoADNet: Collaborative Aggregation-and-Distribution Networks for Co-Salient Object Detection

AGO-Net: Association-Guided 3D Point Cloud Object Detection Network

Dense Attentive Feature Enhancement for Salient Object Detection

One-Shot Affordance Detection

DAANet: Dual Attention Aggregating Network for Salient Object Detection

SAFNet: A Semi-Anchor-Free Network with Enhanced Feature Pyramid for Object Detection

OASNet: Object Affordance State Recognition Network with Joint Visual Features and Relational Semantic Embeddings

Class-Aware Dual-Supervised Aggregation Network for Video Object Detection

LAC-Net: Linear-Fusion Attention-Guided Convolutional Network for Accurate Robotic Grasping Under the Occlusion

SAANet: Spatial Adaptive Alignment Network for Object Detection in Automatic Driving

Object detection based on an adaptive attention mechanism

AShapeFormer: Semantics-Guided Object-Level Active Shape Encoding for 3D Object Detection Via Transformers

DAF-Net: dense attention feature pyramid network for multiscale object detection