Abstract: Advances in computer vision have pushed state-of-the-art (SOTA) pedestrian detection methods to new levels of performance. Even so, most detectors have limited receptive fields, which makes it hard to build global information dependencies and context awareness. These constraints particularly affect the detection of pedestrians at image edges and of small pedestrian targets. Our proposed solution, reparameterized dilated convolution (RDConv), uses sawtooth dilation rates to broaden the receptive field without increasing computational cost. RDConv has the same cost as a small convolutional kernel but offers a larger receptive field, so it can model the relationship between pedestrians and their environment more comprehensively and thereby enhance context awareness. To capture the pedestrian information dependencies that are crucial for edge and small-target detection, we introduce the group multihead self-attention (G-MSA) mechanism. To overcome the high computational cost and limited interaction of traditional self-attention schemes, G-MSA adopts deep separation and supplementary boundary feature computation. RDConv and G-MSA are integrated into a multibranch framework to assess information flow interactions. Because convolution and self-attention place different requirements on activation functions, we propose the dynamic boundary (DB) activation function. It adaptively adjusts the nonlinearity and gradient of the information from each layer of the network to suit the structure that merges the two methods. Applied to YOLOv5s and tested on the City Persons, Caltech Pedestrian, and PASCAL VOC datasets, our approach achieves 33.61 AP0.5, 61.41 AP0.5, and 92.08 mAP (YOLOv5m), respectively. Results across the three datasets strongly affirm the effectiveness of our method.
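The receptive-field claim in the abstract can be illustrated with a short calculation. The sketch below is not the authors' RDConv; it only applies the standard receptive-field formula for stacked stride-1 convolutions. The 3x3 kernel, the repeating dilation pattern 1, 2, 3, and the helper names `receptive_field` and `conv_weights` are assumptions chosen for illustration and do not come from the paper.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one spatial axis of stacked stride-1 convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def conv_weights(kernel_size, c_in, c_out):
    """Weight count of one 2D convolution layer (bias ignored).

    Dilation spaces the kernel taps apart but adds no weights.
    """
    return kernel_size * kernel_size * c_in * c_out


# Six 3x3 layers: ordinary convolutions vs. an assumed sawtooth pattern.
plain = receptive_field(3, [1, 1, 1, 1, 1, 1])
sawtooth = receptive_field(3, [1, 2, 3, 1, 2, 3])

print("plain receptive field:", plain)
print("sawtooth receptive field:", sawtooth)
print("weights per layer (64 -> 64 channels):", conv_weights(3, 64, 64))
```

Under these assumptions the sawtooth stack covers a wider window than the plain stack, while the per-layer weight count is the same because it does not depend on the dilation rate.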

R-SSD: Refined Single Shot Multibox Detector for Pedestrian Detection

Pedestrian Detection Method Based on Improved YOLOv5s for Densely Occluded Scenarios

Towards Accurate Dense Pedestrian Detection Via Occlusion-Prediction Aware Label Assignment and Hierarchical-NMS

A Novel Approach to Design the Fast Pedestrian Detection for Video Surveillance System

Asymmetric multi-stage CNNs for small-scale pedestrian detection

Too Far to See? Not Really! --- Pedestrian Detection with Scale-aware Localization Policy

Pedestrian Detection Using Multi-Channel Visual Feature Fusion by Learning Deep Quality Model

Reparameterized dilated architecture: A wider field of view for pedestrian detection

An Efficient Pedestrian Detection for Realtime Surveillance Systems Based on Modified YOLOv3

Small-Scale Pedestrian Detection Using Fusion Network and Probabilistic Loss

Pedestrian Detection Aided by Scale-Discriminative Network

Deep Convolutional Neural Networks For Pedestrian Detection With Skip Pooling

RCSLFNet: a novel real-time pedestrian detection network based on re-parameterized convolution and channel-spatial location fusion attention for low-resolution infrared image

Robust multi-modal pedestrian detection using deep convolutional neural network with ensemble learning model

Deep Pedestrian Detection Using Contextual Information and Multi-level Features

Multi-Scale Structure Perception and Global Context-Aware Method for Small-Scale Pedestrian Detection

SSA-CNN: Semantic Self-Attention CNN for Pedestrian Detection

Multi-Grained Deep Feature Learning for Pedestrian Detection

Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection

Taking a Look at Small-Scale Pedestrians and Occluded Pedestrians

Multispectral Deep Neural Networks for Pedestrian Detection