Zero-Shot Scene Change Detection

Kyusik Cho, Dong Yeop Kim, Euntai Kim
2024-06-17
Abstract: We present a novel, training-free approach to scene change detection. Our method leverages tracking models, which inherently perform change detection between consecutive frames of video by identifying common objects and detecting new or missing objects. Specifically, our method takes advantage of the change detection effect of the tracking model by inputting reference and query images instead of consecutive frames. Furthermore, we focus on the content gap and style gap between two input images in change detection, and address both issues by proposing adaptive content threshold and style bridging layers, respectively. Finally, we extend our approach to video to exploit rich temporal information, enhancing scene change detection performance. We compare our approach with baselines through various experiments. While existing training-based baselines tend to specialize only in the trained domain, our method shows consistent performance across various domains, proving the competitiveness of our approach.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper proposes Zero-Shot SCD, a method for Scene Change Detection (SCD) that requires no training data. Existing SCD methods are trained deep models that need large amounts of training data and are vulnerable to changes in image style. The new method builds on tracking models, which already perform a form of change detection between consecutive video frames by identifying common objects and detecting newly appeared or disappeared ones. Specifically, the paper introduces the following innovations:

1. The SCD problem is recast as a tracking problem: the reference and query images are fed to the tracking model in place of consecutive frames.
2. An adaptive content threshold and style bridging layers are proposed to address the content gap and the style gap between the two input images, respectively.
3. The method is extended to video sequences so that rich temporal information can improve SCD performance.

The paper points out that existing training-based methods tend to perform well only in the domain they were trained on, while the proposed Zero-Shot SCD method performs consistently across different domains, which the authors present as evidence of its competitiveness. According to the experimental results, the method achieves comparable or even superior performance to training-based methods on multiple benchmark datasets, especially in cross-domain scene change detection.
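The idea of recasting change detection as tracking can be illustrated with a minimal sketch. It is not the authors' implementation: the paper uses a tracking model, whereas this sketch substitutes a greedy IoU match over bounding boxes, and it uses a fixed `match_threshold` instead of the adaptive content threshold. The names `iou`, `detect_changes`, and `match_threshold` are hypothetical and chosen only for illustration; style bridging and the video extension are not modeled.

```python
from typing import Dict, List, Tuple

# Hypothetical object representation: axis-aligned box (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detect_changes(
    reference: List[Box],
    query: List[Box],
    match_threshold: float = 0.5,  # assumption: fixed, not adaptive
) -> Dict[str, List[int]]:
    """Treat (reference, query) like two consecutive frames.

    Objects "tracked" from reference to query are unchanged.
    Untracked reference objects are reported as missing, and
    unmatched query objects are reported as new.
    """
    matched_query = set()
    missing = []
    for r_idx, r_box in enumerate(reference):
        best_j, best_score = -1, 0.0
        for j, q_box in enumerate(query):
            if j in matched_query:
                continue
            score = iou(r_box, q_box)
            if score > best_score:
                best_j, best_score = j, score
        if best_j >= 0 and best_score >= match_threshold:
            matched_query.add(best_j)  # common object, no change
        else:
            missing.append(r_idx)  # object disappeared
    new = [j for j in range(len(query)) if j not in matched_query]
    return {"missing": missing, "new": new}
```

Calling `detect_changes(reference_boxes, query_boxes)` returns the indices of reference objects with no counterpart in the query image and of query objects with no counterpart in the reference image. A fixed threshold like the one above is exactly what an adaptive, content-dependent threshold would replace.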