Chasing Day and Night: Towards Robust and Efficient All-Day Object Detection Guided by an Event Camera

Jiahang Cao, Xu Zheng, Yuanhuiyi Lyu, Jiaxu Wang, Renjing Xu, Lin Wang
DOI: https://doi.org/10.48550/arXiv.2309.09297
2024-03-19
Abstract: The ability to detect objects in all lighting (i.e., normal-, over-, and under-exposed) conditions is crucial for real-world applications, such as self-driving. Traditional RGB-based detectors often fail under such varying lighting conditions. Therefore, recent works utilize novel event cameras to supplement or guide the RGB modality; however, these methods typically adopt asymmetric network structures that rely predominantly on the RGB modality, resulting in limited robustness for all-day detection. In this paper, we propose EOLO, a novel object detection framework that achieves robust and efficient all-day detection by fusing both RGB and event modalities. Our EOLO framework is built based on a lightweight spiking neural network (SNN) to efficiently leverage the asynchronous property of events. Buttressed by it, we first introduce an Event Temporal Attention (ETA) module to learn the high temporal information from events while preserving crucial edge information. Secondly, as different modalities exhibit varying levels of importance under diverse lighting conditions, we propose a novel Symmetric RGB-Event Fusion (SREF) module to effectively fuse RGB-Event features without relying on a specific modality, thus ensuring a balanced and adaptive fusion for all-day detection. In addition, to compensate for the lack of paired RGB-Event datasets for all-day training and evaluation, we propose an event synthesis approach based on the randomized optical flow that allows for directly generating the event frame from a single exposure image. We further build two new datasets, E-MSCOCO and E-VOC, based on the popular benchmarks MSCOCO and PASCAL VOC. Extensive experiments demonstrate that our EOLO outperforms the state-of-the-art detectors, e.g., RENet, by a substantial margin (+3.74% mAP50) in all lighting conditions. Our code and datasets will be available at https://vlislab22.github.io/EOLO/
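The abstract does not spell out how an event frame is synthesized from a single image, so the following is only a rough sketch of the general idea, not the authors' procedure. It assumes that the "randomized optical flow" can be approximated by one random global translation, and that an event fires wherever the log-intensity change between the original and the shifted image exceeds a contrast threshold (the usual event-camera firing model). The function name `synthesize_event_frame` and the parameters `max_shift` and `threshold` are hypothetical.

```python
import math
import random


def synthesize_event_frame(image, max_shift=2, threshold=0.2, rng=None):
    """Sketch: build a polarity frame (-1, 0, +1) from one grayscale image.

    image: 2D list of intensities in [0, 1].
    Assumption (not from the paper): the random flow is a single global
    integer translation (dx, dy) with |dx|, |dy| <= max_shift.
    """
    if max_shift < 1:
        raise ValueError("max_shift must be at least 1")
    rng = rng or random.Random()
    h, w = len(image), len(image[0])

    # Draw a random non-zero displacement to play the role of the flow.
    dx = dy = 0
    while dx == 0 and dy == 0:
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)

    eps = 1e-3  # avoids log(0) on black pixels
    events = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Backward warp: the pixel that moves into (x, y), clamped at borders.
            sy = min(max(y - dy, 0), h - 1)
            sx = min(max(x - dx, 0), w - 1)
            # Log-intensity change between the shifted and the original image.
            diff = math.log(image[sy][sx] + eps) - math.log(image[y][x] + eps)
            if diff >= threshold:
                events[y][x] = 1    # brightness increased: positive event
            elif diff <= -threshold:
                events[y][x] = -1   # brightness decreased: negative event
    return events, (dx, dy)
```

Under these assumptions, a uniformly lit image produces no events, while an intensity edge produces events only where the sampled displacement moves it across pixels, which matches the intuition that event frames mostly encode edges.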
Computer Vision and Pattern Recognition, Robotics