Abstract: This paper aims to develop a faster and more accurate solution to the amodal 3D object detection problem in indoor scenes. The solution is a novel neural network architecture that takes a pair of RGB-D images as input and outputs oriented 3D bounding boxes. The network, named 3D-SSD, has two components: hierarchical feature fusion and multi-layer prediction. The hierarchical feature fusion combines multi-scale appearance and geometric features learned from RGB-D images, which are then used in the multi-layer prediction for object detection. Exploiting the 2.5D representations in this synergistic way improves both accuracy and efficiency. To specifically address the shape variance of different objects, a set of 3D anchor boxes with varying physical sizes is attached to every location on the prediction layers. At test time, category scores are generated for the 3D anchor boxes along with adjusted positions, sizes and orientations, and the final detections are obtained by non-maximum suppression. Comprehensive experiments have been performed on the publicly available SUN RGB-D and NYUv2 datasets. The results show that the proposed algorithm is the first 3D detector that runs in near real-time on these challenging datasets with performance competitive with state-of-the-art methods. 3D-SSD achieves 37.1% mAP on the SUN RGB-D dataset at around 5.6 fps, outperforming the state-of-the-art Deep Sliding Shapes by 10.2% mAP while running around 109× faster. In an efficient model setting that runs at 9.3 fps, 3D-SSD still achieves 37% mAP. Further experiments suggest that the proposed approach achieves comparable accuracy and is about 477× faster than the state-of-the-art method on the NYUv2 dataset, even with a smaller input image size. © 2019 Published by Elsevier B.V.
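
The final step mentioned in the abstract, non-maximum suppression over scored 3D boxes, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes axis-aligned boxes encoded as (cx, cy, cz, w, h, d), whereas 3D-SSD predicts oriented boxes, and the function names and the 0.5 IoU threshold are hypothetical choices made only for illustration.

```python
import numpy as np


def iou_3d_axis_aligned(box, boxes):
    """IoU of one box against an (N, 6) array of boxes.

    Each box is (cx, cy, cz, w, h, d). Orientation is ignored here,
    which is a simplification relative to oriented 3D boxes.
    """
    lo = np.maximum(box[:3] - box[3:] / 2, boxes[:, :3] - boxes[:, 3:] / 2)
    hi = np.minimum(box[:3] + box[3:] / 2, boxes[:, :3] + boxes[:, 3:] / 2)
    # Overlap extent per axis, clipped at zero when boxes do not intersect.
    inter = np.prod(np.clip(hi - lo, 0.0, None), axis=1)
    union = np.prod(box[3:]) + np.prod(boxes[:, 3:], axis=1) - inter
    return inter / union


def nms_3d(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = np.argsort(-scores)  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        ious = iou_3d_axis_aligned(boxes[best], boxes[rest])
        order = rest[ious <= iou_threshold]  # hypothetical threshold
    return keep


boxes = np.array([
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    [0.1, 0.0, 0.0, 1.0, 1.0, 1.0],  # heavily overlaps the first box
    [3.0, 3.0, 3.0, 1.0, 1.0, 1.0],  # far away from both
])
scores = np.array([0.9, 0.8, 0.7])
kept = nms_3d(boxes, scores)
```

In this toy input the second box overlaps the first with an IoU of about 0.82, so it is suppressed, while the distant third box is kept. A detector producing oriented boxes would need a rotated-box IoU in place of the axis-aligned one used here.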

CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds

3D-SSD: Learning Hierarchical Features from RGB-D Images for Amodal 3D Object Detection

SPGroup3D: Superpoint Grouping Network for Indoor 3D Object Detection

From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network

AGO-Net: Association-Guided 3D Point Cloud Object Detection Network

A Hierarchical Graph Network for 3D Object Detection on Point Clouds

REGNet: Ray-Based Enhancement Grouping for 3D Object Detection Based on Point Cloud

An Efficient Ungrouped Mask Method with Two Learnable Parameters for 3D Object Detection

Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection

Three-Dimensional Point Cloud Object Detection Based on Feature Fusion and Enhancement

Structure Aware Single-Stage 3D Object Detection From Point Cloud

RBGNet: Ray-based Grouping for 3D Object Detection

Anchor-free 3D Single Stage Detector with Mask-Guided Attention for Point Cloud

Semantic-Context Graph Network for Point-based 3D Object Detection

KDA3D: Key-Point Densification and Multi-Attention Guidance for 3D Object Detection

PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud

MS23D: A 3D object detection method using multi-scale semantic feature points to construct 3D feature layer

Context-Aware Dynamic Feature Extraction for 3D Object Detection in Point Clouds

ODSPC: deep learning-based 3D object detection using semantic point cloud

SGCCNet: Single-Stage 3D Object Detector With Saliency-Guided Data Augmentation and Confidence Correction Mechanism

From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection