VRDistill: Vote Refinement Distillation for Efficient Indoor 3D Object Detection

Ze Yuan, Jinyang Guo, Dakai An, Junran Wu, He Zhu, Jianhao Li, Xueyuan Chen, Ke Xu, Jiaheng Liu
DOI: https://doi.org/10.1145/3664647.3681121
2024-01-01
Abstract: Recently, indoor 3D object detection has shown impressive progress. However, these improvements have come at the cost of increased memory consumption and longer inference times, making it difficult to apply these methods in practical scenarios. To address this issue, knowledge distillation has emerged as a promising technique for model acceleration. In this paper, we propose the VRDistill framework, the first knowledge distillation framework designed for efficient indoor 3D object detection. Our VRDistill framework includes a refinement module and a soft foreground mask operation to enhance the quality of the distillation. The refinement module utilizes trainable layers to improve the quality of the teacher's votes, while the soft foreground mask operation focuses on foreground votes, further enhancing the distillation performance. Comprehensive experiments on the ScanNet and SUN-RGBD datasets demonstrate the effectiveness and generalization ability of our VRDistill framework.
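
The abstract does not give the loss formulation, so the following is only an illustrative sketch of what a soft-foreground-weighted vote distillation term might look like. It assumes each vote is a 3D point and that the soft mask is a per-vote foreground score in [0, 1]; the function name `soft_masked_vote_loss`, its parameters, and the weighted squared-distance form are all hypothetical and not taken from the paper. The trainable refinement module is not modeled here; `teacher_votes` simply stands in for whatever votes the teacher side supplies.

```python
from typing import Sequence

Point = Sequence[float]


def soft_masked_vote_loss(
    student_votes: Sequence[Point],
    teacher_votes: Sequence[Point],
    fg_scores: Sequence[float],
    eps: float = 1e-8,
) -> float:
    """Hypothetical distillation term: squared distance between student and
    teacher votes, weighted by a soft foreground score per vote.

    This is an assumed form for illustration, not the paper's definition.
    """
    if not (len(student_votes) == len(teacher_votes) == len(fg_scores)):
        raise ValueError("votes and foreground scores must have equal length")

    weighted_sum = 0.0
    weight_total = 0.0
    for s, t, w in zip(student_votes, teacher_votes, fg_scores):
        # Squared Euclidean distance between the two votes.
        sq_dist = sum((si - ti) ** 2 for si, ti in zip(s, t))
        weighted_sum += w * sq_dist
        weight_total += w

    # Normalize by the total mask weight so the scale does not depend on
    # how many votes are treated as foreground.
    return weighted_sum / (weight_total + eps)


if __name__ == "__main__":
    student = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    teacher = [(1.0, 0.0, 0.0), (1.0, 1.0, 4.0)]
    # The second vote is treated as pure background and contributes nothing.
    print(soft_masked_vote_loss(student, teacher, [1.0, 0.0]))
```

Under these assumptions, votes with a low foreground score contribute little to the loss, which mirrors the abstract's description of the mask focusing distillation on foreground votes.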