MVM3Det: A Novel Method for Multi-view Monocular 3D Detection

Li Haoran, Duan Zicheng, Ma Mingjun, Chen Yaran, Li Jiaqi, Zhao Dongbin
DOI: https://doi.org/10.48550/arXiv.2109.10473
2021-09-22
Abstract: Monocular 3D object detection suffers from occlusion in many application scenarios, such as traffic monitoring and pedestrian monitoring, which leads to a high rate of false negatives. Multi-view object detection effectively solves this problem by combining data from different perspectives. However, orientation estimation, which is important for object tracking and intention prediction, remains intractable in multi-view 3D object detection because of label confusion and feature confusion. In this paper, we propose a novel multi-view 3D object detection method named MVM3Det, which simultaneously estimates the 3D position and orientation of each object from multi-view monocular information. The method consists of two parts: 1) a position proposal network, which integrates the features from different perspectives into consistent global features through feature orthogonal transformation to estimate the position; and 2) a multi-branch orientation estimation network, which introduces feature perspective pooling to overcome the two confusion problems during orientation estimation. In addition, we present the first dataset for multi-view 3D object detection, named MVM3D. Compared with state-of-the-art (SOTA) methods on our dataset and the public dataset WildTrack, our method achieves very competitive results.
Computer Vision and Pattern Recognition