Disentangled Contrastive Image Translation for Nighttime Surveillance

Guanzhou Lan, Bin Zhao, Xuelong Li
2023-07-11
Abstract: Nighttime surveillance suffers from degradation due to poor illumination and arduous human annotations. It is challenging and remains a security risk at night. Existing methods rely on multi-spectral images to perceive objects in the dark, which are troubled by low resolution and color absence. We argue that the ultimate solution for nighttime surveillance is night-to-day translation, or Night2Day, which aims to translate a surveillance scene from nighttime to daytime while maintaining semantic consistency. To achieve this, this paper presents a Disentangled Contrastive (DiCo) learning method. Specifically, to address the poor and complex illumination in nighttime scenes, we propose a learnable physical prior, i.e., the color invariant, which provides a stable perception of a highly dynamic night environment and can be incorporated into the learning pipeline of neural networks. Targeting surveillance scenes, we develop a disentangled representation, which is an auxiliary pretext task that separates surveillance scenes into foreground and background with contrastive learning. Such a strategy can extract the semantics without supervision and enables our model to achieve instance-aware translation. Finally, we incorporate all the modules above into generative adversarial networks and achieve high-fidelity translation. This paper also contributes a new surveillance dataset called NightSuR. It includes six scenes to support the study of nighttime surveillance. This dataset collects nighttime images with different properties of nighttime environments, such as flare and extreme darkness. Extensive experiments demonstrate that our method outperforms existing works significantly. The dataset and source code will be released on GitHub soon.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper addresses the poor image quality of nighttime surveillance. Poor lighting and the difficulty of annotation degrade nighttime surveillance images, which poses a security risk. Existing methods rely on multispectral images to perceive objects in the dark, but these images usually have low resolution and lack color information. The paper therefore frames the problem as night-to-day translation ("Night2Day"): converting a nighttime surveillance scene into a daytime scene while keeping the semantic content of the image consistent.

To this end, the paper proposes a "Disentangled Contrastive" (DiCo) learning method, built into a generative adversarial network. Its main contributions are:

1. **Learnable Color Invariants**: To cope with the complex lighting variations in nighttime images, color invariants based on the Kubelka-Munk theory are introduced as a learnable physical prior that gives the network a stable perception of the scene.
2. **Disentangled Representation**: The surveillance scene is separated into foreground and background, and semantic information is extracted through a contrastive learning strategy. This works without supervision and yields instance-aware image translation with better visual quality.
3. **New Dataset**: The paper also contributes a new nighttime surveillance dataset called NightSuR, which contains 6574 images across 6 different scenes, covering difficult conditions such as extreme darkness and strong light (flare).

With these components, the paper reports significantly better results than existing methods in its experiments.
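
The contrastive learning strategy described above can be illustrated with a generic InfoNCE loss, which pulls a query feature toward a positive feature and pushes it away from negatives. This is only a minimal sketch in NumPy, not the paper's formulation: the function name `info_nce`, the temperature value, and the choice of treating a same-region feature as the positive and background features as negatives are all assumptions made for illustration.

```python
import numpy as np


def info_nce(query, positive, negatives, temperature=0.07):
    """Generic InfoNCE loss for a single query vector (illustrative only).

    query:     (d,)   feature of the anchor patch
    positive:  (d,)   feature that should be close to the query
    negatives: (n, d) features that should be far from the query
    """
    def normalize(x):
        # L2-normalize along the feature axis so dot products are cosines.
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    q, p, n = normalize(query), normalize(positive), normalize(negatives)

    # Index 0 holds the positive similarity; the rest are negatives.
    logits = np.concatenate(([q @ p], n @ q)) / temperature
    logits = logits - logits.max()  # numerical stability

    # Cross-entropy with the positive as the target class.
    return -(logits[0] - np.log(np.exp(logits).sum()))


# Hypothetical usage: random vectors stand in for patch features.
rng = np.random.default_rng(0)
foreground = rng.normal(size=8)
foreground_same = foreground + 0.05 * rng.normal(size=8)  # near-duplicate
background = rng.normal(size=(16, 8))

loss = info_nce(foreground, foreground_same, background)
print(f"InfoNCE loss: {loss:.4f}")
```

The loss is small when the query is much more similar to its positive than to any negative, which is the property a foreground/background separation objective would rely on.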