A Method for Training Object Scale Estimation System using Feature Extraction Enhancement with Depth Estimation

Kyungryun Kim
DOI: https://doi.org/10.47611/jsrhs.v12i4.5144
2023-11-30
Journal of Student Research
Abstract: Machine learning-based object scale estimation has grown in popularity in recent years because of its potential use across many industries. Although several methods have been proposed, their insufficient accuracy limits practical application, so a system with human-level accuracy is needed before the technology can be deployed in real-world settings. This paper proposes a novel object scale estimation system that comprises a feature extractor, disentangled feature maps, a depth estimator, an object localizer, and a ground truth depth map. The system takes an image as input and passes it to the feature extractor, which produces disentangled feature maps. The depth estimator uses these feature maps to generate a depth map, and the object localizer uses them to predict a bounding box around the object. By jointly training the depth estimator and the object localizer, the feature extractor learns to extract disentangled, size-related features from the input image, and these disentangled features boost the performance of the proposed system. In addition, we propose an actual scale converter module that calculates the actual size of the object in the input image. In the experiments, the proposed method outperforms the other state-of-the-art methods it is compared against, achieving an IoU (Intersection over Union) of 0.8113 on the COCO dataset.
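For readers unfamiliar with the reported metric, the following is a minimal sketch of how IoU is commonly computed for two axis-aligned bounding boxes. It is not taken from the paper: the `(x_min, y_min, x_max, y_max)` box layout, the `Box` alias, and the function name `iou` are assumptions chosen for illustration, and the paper's exact evaluation protocol may differ.

```python
from typing import Tuple

# Assumed box layout (not specified in the abstract): (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes give an empty intersection
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0.0, 0.0, 2.0, 2.0)
    ground_truth = (1.0, 1.0, 3.0, 3.0)
    print(iou(predicted, ground_truth))
```

A value of 1.0 means the predicted box matches the ground truth box exactly, and 0.0 means the two boxes do not overlap, so the reported 0.8113 indicates substantial but not perfect overlap.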