Temporally Consistent Enhancement of Low-Light Videos via Spatial-Temporal Compatible Learning

Lingyu Zhu, Wenhan Yang, Baoliang Chen, Hanwei Zhu, Xiandong Meng, Shiqi Wang

DOI: https://doi.org/10.1007/s11263-024-02084-w

IF: 13.369

2024-05-24

International Journal of Computer Vision

Abstract: Temporal inconsistency is an annoying artifact commonly introduced in low-light video enhancement, but current methods tend to overlook the significance of utilizing both data-centric clues and model-centric design to tackle this problem. In this context, our work makes a comprehensive exploration from the following three aspects. First, to enrich scene diversity and motion flexibility, we construct a diverse synthetic low/normal-light paired video dataset with a carefully designed low-light simulation strategy, which can effectively complement existing real captured datasets. Second, for better utilization of temporal dependencies, we develop a Temporally Consistent Enhancer Network (TCE-Net) that consists of stacked 3D convolutions and 2D convolutions to exploit spatial-temporal clues in videos. Last, the temporal dynamic feature dependencies are exploited to obtain consistency constraints for different frame indexes. All these efforts are powered by a Spatial-Temporal Compatible Learning (STCL) optimization technique, which dynamically constructs specific training loss functions for different datasets. As such, multiple-frame information can be effectively utilized and different levels of information from the network can be feasibly integrated, thus expanding the synergies across different kinds of data and offering visually better results in terms of illumination distribution, color consistency, texture details, and temporal coherence. Extensive experimental results on various real-world low-light video datasets demonstrate that the proposed method achieves superior performance to state-of-the-art methods. Our code and synthesized low-light video database will be publicly available at https://github.com/lingyzhu0101/low-light-video-enhancement.git.
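
The abstract does not spell out the low-light simulation strategy, so the following is only a minimal sketch of one common way to darken a normal-light clip: a gamma curve, a linear gain, and additive Gaussian noise. The function name `simulate_low_light` and the parameters `gamma`, `gain`, and `noise_sigma` are hypothetical and are not taken from the paper. The one design point it tries to illustrate is that the darkening parameters are shared by every frame in a clip, so the simulated exposure does not itself introduce flicker.

```python
import random


def simulate_low_light(frames, gamma=2.5, gain=0.4, noise_sigma=0.02, seed=0):
    """Darken a clip of normal-light frames (illustrative assumption only).

    frames: list of frames; each frame is a list of rows of floats in [0, 1].
    gamma, gain, noise_sigma: hypothetical parameters, shared by all frames.
    """
    rng = random.Random(seed)
    dark_clip = []
    for frame in frames:
        dark_frame = []
        for row in frame:
            dark_row = []
            for value in row:
                # Gamma curve followed by a linear gain lowers the exposure.
                darkened = gain * (value ** gamma)
                # Additive Gaussian noise stands in for sensor noise.
                noisy = darkened + rng.gauss(0.0, noise_sigma)
                # Clip back to the valid intensity range.
                dark_row.append(min(1.0, max(0.0, noisy)))
            dark_frame.append(dark_row)
        dark_clip.append(dark_frame)
    return dark_clip


# Three identical 2x2 grayscale frames as a toy normal-light clip.
clip = [[[0.2, 0.8], [0.5, 1.0]] for _ in range(3)]
low = simulate_low_light(clip, noise_sigma=0.0)
```

With `noise_sigma=0.0` the mapping is deterministic, which makes it easy to check that identical input frames stay identical after darkening.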

computer science, artificial intelligence

What problem does this paper attempt to address?

The paper addresses temporal inconsistency in low-light video enhancement. Current methods tend to overlook the value of combining data-centric cues with model-centric design, so enhanced videos often show artifacts such as flickering over time, which degrades both the viewing experience and the performance of downstream computer vision tasks. To tackle this challenge, the paper explores three aspects:

1. **Dataset Construction**: To enrich scene diversity and motion flexibility, the authors construct a synthetic low/normal-light paired video dataset using a carefully designed low-light simulation strategy, which complements existing real captured datasets.
2. **Model Design**: To better utilize temporal dependencies, the authors develop a Temporally Consistent Enhancer Network (TCE-Net), which consists of stacked 3D convolutions and 2D convolutions to extract spatial-temporal cues in videos.
3. **Training Mechanism**: Temporal dynamic feature dependencies are exploited to obtain consistency constraints for different frame indexes, and a Spatial-Temporal Compatible Learning (STCL) optimization technique dynamically constructs specific training loss functions for different datasets, so that multi-frame information and information from different network levels can be used together.

These efforts aim to expand the synergies across different kinds of data and to produce better visual results in terms of illumination distribution, color consistency, texture details, and temporal coherence. Experimental results show that the method outperforms state-of-the-art methods on various real-world low-light video datasets.
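
To make the notion of flicker concrete, here is a minimal sketch of a pixel-level temporal inconsistency score. It is an illustrative proxy, not the paper's consistency constraint (which is derived from temporal dynamic feature dependencies): it compares the frame-to-frame change in an enhanced clip with the frame-to-frame change in a reference clip. The function name `temporal_inconsistency` and the nested-list frame layout are assumptions made for this example.

```python
def temporal_inconsistency(enhanced, reference):
    """Mean absolute gap between temporal changes of two clips.

    enhanced, reference: lists of frames of equal shape; each frame is a
    list of rows of floats. A score of 0 means the enhanced clip changes
    over time exactly as the reference does (no added flicker).
    """
    if len(enhanced) != len(reference) or len(enhanced) < 2:
        raise ValueError("need two clips of equal length with at least 2 frames")
    total, count = 0.0, 0
    for t in range(1, len(enhanced)):
        rows = zip(enhanced[t], enhanced[t - 1], reference[t], reference[t - 1])
        for row_e1, row_e0, row_r1, row_r0 in rows:
            for e1, e0, r1, r0 in zip(row_e1, row_e0, row_r1, row_r0):
                # Difference between the enhanced and reference temporal change.
                total += abs((e1 - e0) - (r1 - r0))
                count += 1
    return total / count


# A static reference scene, a steady output, and an output whose
# brightness jumps in the middle frame.
reference = [[[0.5, 0.5]] for _ in range(3)]
steady = [[[0.5, 0.5]] for _ in range(3)]
flicker = [[[0.5, 0.5]], [[0.6, 0.6]], [[0.5, 0.5]]]

steady_score = temporal_inconsistency(steady, reference)
flicker_score = temporal_inconsistency(flicker, reference)
```

In this toy case the steady output scores zero, while the flickering output scores higher because its brightness changes between frames even though the reference scene is static.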

STARNet: Low-light Video Enhancement Using Spatio-Temporal Consistency Aggregation

Low-Light Video Enhancement via Spatial-Temporal Consistent Illumination and Reflection Decomposition

Low-Light Video Enhancement with Synthetic Event Guidance

Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement

Mutual Support and Promotion: Learning Structure Compensation and Context Completion for Low-Light Vision

LVE-S2D: Low-Light Video Enhancement From Static to Dynamic

Learning Deep Context-Sensitive Decomposition for Low-Light Image Enhancement

Low-Light Image Enhancement by Learning Contrastive Representations in Spatial and Frequency Domains

Low-Light Image Enhancement with Multi-Scale Attention and Frequency-Domain Optimization

LEDNet: Joint Low-light Enhancement and Deblurring in the Dark

Division gets better: Learning brightness-aware and detail-sensitive representations for low-light image enhancement

Enlightening Low-Light Images With Dynamic Guidance for Context Enrichment

Seeing Dark Videos Via Self-Learned Bottleneck Neural Representation

A Spatio-temporal Aligned SUNet Model for Low-light Video Enhancement

EvLight++: Low-Light Video Enhancement with an Event Camera: A Large-Scale Real-World Dataset, Novel Method, and More

Coherent Event Guided Low-Light Video Enhancement

Learning an Adaptive Model for Extreme Low-light Raw Image Processing

Attention Guided Low-Light Image Enhancement with a Large Scale Low-Light Simulation Dataset

Exploiting Temporal Consistency for Real-Time Video Depth Estimation

DiffLLE: Diffusion-based Domain Calibration for Weak Supervised Low-light Image Enhancement