Deep3DRanker: A Novel Framework for Learning to Rank 3D Models with Self-Attention in Robotic Vision

Frank Po Wen Lo, Yao Guo, Yingnan Sun, Jianing Qiu, Benny Lo
DOI: https://doi.org/10.1109/icra48506.2021.9561732
2021-01-01
Abstract: Research on generating and processing point clouds has become increasingly popular in robotics due to its extensive applications, such as robotic grasping, augmented reality, and autonomous vehicle navigation. In this paper, we explore a new research area on point clouds: learning to rank 3D models captured from a single depth image. In the Learning To Rank (LTR) task, we aim to optimize the order of a list of 3D models according to a given query. Inspired by recent advances in Natural Language Processing (NLP), we propose a novel framework, namely Deep3DRanker, for ranking 3D models by leveraging graph-based encoding and self-attention mechanisms. Comprehensive experiments are conducted to validate our methods on the publicly available YCB synthetic and YCB video datasets. The results show that the proposed framework is generic enough to be applied to any combination of randomly positioned, randomly oriented, and unseen objects, with accuracy ranging from 59.2% to 94.9%. This demonstrates the potential of the proposed framework for robotic applications, in particular for making decisions under different circumstances.
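The abstract does not describe the architecture in detail, so the following is only a minimal sketch of the general idea of ranking a list of items against a query with self-attention. It is not the Deep3DRanker model. Everything in it is an assumption made for illustration: the function names (`softmax`, `dot`, `self_attention`, `rank_items`), the use of identity query/key/value projections instead of learned weights, the toy 2-D vectors standing in for encoded 3D models, and the dot-product scoring rule. The graph-based encoding mentioned in the abstract is omitted entirely.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(items):
    """Scaled dot-product self-attention across the list of item embeddings.

    Assumption: identity query/key/value projections (no learned weights),
    so each output is an attention-weighted average of the input embeddings.
    """
    d = len(items[0])
    contextual = []
    for q in items:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in items])
        contextual.append(
            [sum(w * v[j] for w, v in zip(weights, items)) for j in range(d)]
        )
    return contextual


def rank_items(query, items):
    """Return (order, scores): item indices sorted by descending score.

    Assumption: the score is the dot product between the query embedding
    and each item's attention-contextualized embedding.
    """
    contextual = self_attention(items)
    scores = [dot(query, c) for c in contextual]
    order = sorted(range(len(items)), key=lambda i: scores[i], reverse=True)
    return order, scores


if __name__ == "__main__":
    # Hypothetical embeddings standing in for encoded 3D models.
    query = [1.0, 0.0]
    items = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
    order, scores = rank_items(query, items)
    print("ranking:", order)
    print("scores:", [round(s, 4) for s in scores])
```

Because the attention step mixes information across the whole list, each item's score depends on the other candidates as well as on the query, which is the property that makes self-attention a natural fit for list-wise ranking.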