Disentangled Contrastive Image Translation for Nighttime Surveillance

Guanzhou Lan, Bin Zhao, Xuelong Li

2023-07-11

Abstract: Nighttime surveillance suffers from degradation due to poor illumination and the difficulty of human annotation. It remains challenging and poses a security risk at night. Existing methods rely on multi-spectral images to perceive objects in the dark, but these images suffer from low resolution and a lack of color. We argue that the ultimate solution for nighttime surveillance is night-to-day translation, or Night2Day, which aims to translate a surveillance scene from nighttime to daytime while maintaining semantic consistency. To achieve this, this paper presents a Disentangled Contrastive (DiCo) learning method. Specifically, to address the poor and complex illumination in nighttime scenes, we propose a learnable physical prior, i.e., the color invariant, which provides a stable perception of a highly dynamic night environment and can be incorporated into the learning pipeline of neural networks. Targeting surveillance scenes, we develop a disentangled representation, an auxiliary pretext task that separates surveillance scenes into foreground and background with contrastive learning. This strategy extracts semantics without supervision and enables our model to achieve instance-aware translation. Finally, we incorporate all of the modules above into generative adversarial networks and achieve high-fidelity translation. This paper also contributes a new surveillance dataset called NightSuR. It includes six scenes to support the study of nighttime surveillance, and it collects nighttime images covering different properties of nighttime environments, such as flare and extreme darkness. Extensive experiments demonstrate that our method significantly outperforms existing works. The dataset and source code will be released on GitHub soon.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper aims to address the issue of image quality in nighttime surveillance. Due to poor lighting conditions and the difficulty of annotation, nighttime surveillance images often suffer from poor quality, posing security risks. Existing methods rely on multispectral images to perceive objects in the dark, but these methods usually have low resolution and lack color information. Therefore, this paper proposes a "Night2Day" method, which improves the perceptual quality of nighttime images by converting nighttime surveillance scenes into daytime scenes while maintaining the semantic consistency of the image content.

Specifically, the paper proposes a method called "Disentangled Contrastive Learning" (DiCo). This method includes the following aspects:

1. **Learnable Color Invariants**: To cope with the complex lighting variations in nighttime images, color invariants based on the Kubelka-Munk theory are introduced to provide stable perceptual capabilities.
2. **Disentangled Representation**: The surveillance scene is divided into foreground and background, and semantic information is extracted through a contrastive learning strategy, achieving better visual effects and instance-aware image translation under unsupervised conditions.
3. **New Dataset**: The paper also releases a new nighttime surveillance dataset called NightSuR, which contains 6574 images across 6 different scenes, covering complex environments such as extremely low light and strong light.

Through these methods, the paper achieves significantly better results than existing methods on multiple benchmarks.
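The contrastive learning strategy mentioned above can be illustrated with a generic InfoNCE loss, which pulls a query feature toward a positive feature and pushes it away from negatives. This is only a minimal sketch of the general technique, not the paper's actual loss: the function name `info_nce`, the temperature value, and the idea of treating features from the same region (foreground or background) as positives are all assumptions made for illustration.

```python
import numpy as np

def info_nce(query, positive, negatives, temperature=0.07):
    """Generic InfoNCE loss for a single query feature (illustrative only).

    query:     (d,)   feature vector, e.g. a patch from the translated image
    positive:  (d,)   feature that should match the query
    negatives: (n, d) features that should not match the query
    The temperature of 0.07 is an assumed, commonly used default.
    """
    def l2_normalize(x):
        # Normalize along the feature axis so dot products are cosine similarities.
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    q = l2_normalize(np.asarray(query, dtype=float))
    k_pos = l2_normalize(np.asarray(positive, dtype=float))
    k_neg = l2_normalize(np.asarray(negatives, dtype=float))

    # Index 0 holds the positive similarity; the rest are negatives.
    logits = np.concatenate(([q @ k_pos], k_neg @ q)) / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability

    # Cross-entropy with the positive as the target class.
    log_prob_pos = logits[0] - np.log(np.exp(logits).sum())
    return float(-log_prob_pos)

# Hypothetical usage: a matched positive yields a lower loss than a mismatched one.
q = np.array([1.0, 0.0])
loss_matched = info_nce(q, np.array([1.0, 0.0]), np.array([[0.0, 1.0], [-1.0, 0.0]]))
loss_mismatched = info_nce(q, np.array([0.0, 1.0]), np.array([[1.0, 0.0], [-1.0, 0.0]]))
```

In a disentangled setting, one would presumably compute such a loss separately over foreground and background features, but how the paper forms its positive and negative pairs is not stated in this summary.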

