RHRSegNet: Relighting High-Resolution Night-Time Semantic Segmentation

Sarah Elmahdy, Rodaina Hebishy, Ali Hamdi
2024-07-08
Abstract: Nighttime semantic segmentation is a crucial task in computer vision, focusing on accurately classifying and segmenting objects in low-light conditions. It is essential for autonomous driving, yet techniques designed for daytime often perform worse in nighttime scenes because of low illumination, dynamic lighting, shadow effects, and reduced contrast. We propose RHRSegNet, which places a relighting model in front of a High-Resolution Network (HRNet) for semantic segmentation. RHRSegNet implements residual convolutional feature learning to handle complex lighting conditions. Our model then feeds the relit scene feature maps into the high-resolution network for scene segmentation. The network is convolutional and produces feature maps at varying resolutions, obtained through down-sampling and up-sampling. Large nighttime datasets are used for training and evaluation, such as NightCity, Cityscapes, and Dark Zurich. Our proposed model increases HRNet segmentation performance by 5% on low-light or nighttime images.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper mainly addresses the issue of semantic segmentation in nighttime environments and proposes a new solution, RHRSegNet (Relighting High-Resolution Night-Time Semantic Segmentation Network). Nighttime semantic segmentation is a crucial task in computer vision, especially in fields like autonomous driving, where it is necessary to accurately classify and segment objects in images under complex conditions such as insufficient lighting, low brightness, dynamic lighting, shadow effects, and reduced contrast.

### Main Issues and Solutions

- **Issue**: Most existing semantic segmentation techniques perform poorly under nighttime or low-light conditions. This is primarily due to decreased visibility and increased noise in these conditions, making it difficult for existing methods to maintain high accuracy.
- **Solutions**:
  - An RHRSegNet model combining a relighting model and a high-resolution network is proposed to handle complex lighting conditions in nighttime scenes.
  - Residual convolutional feature learning is used to address complex lighting variations.
  - The relit scene feature maps are fed into a high-resolution network for scene segmentation, which achieves different levels of resolution through downsampling and upsampling.
  - Training and evaluation are conducted on large nighttime datasets such as NightCity, Cityscapes, and Dark Zurich.
  - Cross-dataset knowledge transfer improves the segmentation performance of HRNet on low-light or nighttime images by approximately 5%.

### Technical Highlights

- **Relighting Model**: Simulates lighting changes to enhance the brightness and contrast of images, improving the quality of nighttime images.
- **Shared Weight Learning**: Optimizes the efficiency of learning parameters, ensures consistent feature extraction, and enhances the model's generalization ability.
- **High-Resolution Network**: Utilizes a high-resolution network architecture to maintain high-resolution representations and capture multi-scale contextual information.
- **Data Augmentation**: Enhances the training dataset through operations such as cropping, scaling, and flipping, improving the model's generalization ability.

### Experimental Results

- On the Dark Zurich dataset, RHRSegNet achieved the highest mean Intersection over Union (mIoU), 27.19%, compared to models like AdaptSegNet, GCMA, and MGCDA.
- On the Cityscapes dataset, RHRSegNet also performed well, achieving 37.53% mIoU and significantly surpassing other competing models.
- Comparisons with various models validate the superior performance of RHRSegNet in handling complex and challenging conditions.

In summary, RHRSegNet is an effective model specifically designed for nighttime semantic segmentation, capable of accurately identifying and segmenting objects in images under complex lighting conditions, which is of great significance for applications such as autonomous driving.
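
The results above are reported as mean Intersection over Union. For readers unfamiliar with the metric, the following is a minimal sketch of how it is commonly computed from a predicted and a ground-truth label map. It is not the authors' evaluation code: the function name `mean_iou`, the `ignore_index=255` default for unlabeled pixels, and the toy label maps are illustrative assumptions.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in the prediction or ground truth.

    pred and target are 2-D lists of integer class ids with the same shape.
    Pixels whose ground-truth label equals ignore_index are skipped
    (assumed convention, not taken from the paper).
    """
    intersection = [0] * num_classes  # pixels where pred == target == c
    union = [0] * num_classes         # pixels where pred == c or target == c
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if t == ignore_index:
                continue
            if p == t:
                intersection[t] += 1
                union[t] += 1
            else:
                # A mismatch enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    # Classes absent from both maps have an undefined IoU and are left out.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy 2x3 label maps with three classes; one pixel is unlabeled (255).
pred = [[0, 0, 1],
        [1, 1, 2]]
target = [[0, 1, 1],
          [1, 1, 255]]
print(mean_iou(pred, target, num_classes=3))
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 3/4, while class 2 never appears among the labeled pixels and is excluded, so the mean is 0.625. Benchmark scores such as those quoted above are typically accumulated over the whole evaluation set before averaging, rather than per image.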