Learning Scene-Specific Object Detectors Based on a Generative-Discriminative Model with Minimal Supervision

Dapeng Luo, Siyuan Lei, Peng Guo, Changxin Gao, Ying Chen, Jinsheng Li, Longsheng Wei
DOI: https://doi.org/10.1016/j.patrec.2022.05.007
IF: 4.757
2022-01-01
Pattern Recognition Letters
Abstract: In multi-scene object detection, a single object class may vary widely in appearance because of diverse illumination, backgrounds, and camera viewpoints. Traditional object detection methods generally perform poorly in unconstrained video environments. To address this problem, many modern approaches learn deep hierarchical appearance representations for object detection, but most of them require time-consuming training on large, manually annotated sample sets. In this paper, we propose a self-learning object detection framework that resolves the multi-scene detection problem in a bottom-up manner. A scene-specific detector is obtained through an autonomous learning process that is triggered by drawing several bounding boxes around an object in the first video frame with a mouse; neither manually labeled training data nor generic detectors are needed. This learning process can be conveniently repeated in different surveillance scenarios to produce scene-specific detectors for various camera viewpoints. However, the initial scene-specific detector, initialized from only several bounding boxes, exhibits poor detection performance and is difficult to improve with traditional online learning algorithms. We therefore propose a detection method based on a Generative-Discriminative Model (GDM) that partitions the detection response space and assigns each partition an individual descriptor, progressively achieving high classification accuracy. An online gradual optimization process is proposed to optimize the Generative-Discriminative Model, focusing on hard samples that lie near the decision boundary. Experimental results on nine video datasets show that our approach achieves performance comparable to that of robust supervised methods and outperforms state-of-the-art scene-specific object detection methods under varying imaging conditions. (C) 2022 Elsevier B.V. All rights reserved.