Self-Supervised Keypoint Detection with Distilled Depth Keypoint Representation

Aman Anand,Elyas Rashno,Amir Eskandari,Farhana Zulkernine
2024-10-05
Abstract:Existing unsupervised keypoint detection methods apply artificial deformations to images such as masking a significant portion of images and using reconstruction of original image as a learning objective to detect keypoints. However, this approach lacks depth information in the image and often detects keypoints on the background. To address this, we propose Distill-DKP, a novel cross-modal knowledge distillation framework that leverages depth maps and RGB images for keypoint detection in a self-supervised setting. During training, Distill-DKP extracts embedding-level knowledge from a depth-based teacher model to guide an image-based student model with inference restricted to the student. Experiments show that Distill-DKP significantly outperforms previous unsupervised methods by reducing mean L2 error by 47.15% on Human3.6M, mean average error by 5.67% on Taichi, and improving keypoints accuracy by 1.3% on DeepFashion dataset. Detailed ablation studies demonstrate the sensitivity of knowledge distillation across different layers of the network. Project Page: <a class="link-external link-https" href="https://23wm13.github.io/distill-dkp/" rel="external noopener nofollow">this https URL</a>
Computer Vision and Pattern Recognition,Artificial Intelligence
What problem does this paper attempt to address?