Reversible Decoupling Network for Single Image Reflection Removal

Hao Zhao,Mingjia Li,Qiming Hu,Xiaojie Guo
2024-10-10
Abstract: Recent deep-learning-based approaches to single-image reflection removal have shown promising advances, primarily for two reasons: 1) the utilization of recognition-pretrained features as inputs, and 2) the design of dual-stream interaction networks. However, according to the Information Bottleneck principle, high-level semantic clues tend to be compressed or discarded during layer-by-layer propagation. Additionally, interactions in dual-stream networks follow a fixed pattern across different layers, limiting overall performance. To address these limitations, we propose a novel architecture called Reversible Decoupling Network (RDNet), which employs a reversible encoder to secure valuable information while flexibly decoupling transmission- and reflection-relevant features during the forward pass. Furthermore, we customize a transmission-rate-aware prompt generator to dynamically calibrate features, further boosting performance. Extensive experiments demonstrate the superiority of RDNet over existing SOTA methods on five widely-adopted benchmark datasets. Our code will be made publicly available.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses two main problems in Single Image Reflection Removal (SIRR):

1. **Information Bottleneck Problem**: In existing deep-learning methods, high-level semantic cues are often compressed or lost during layer-by-layer propagation. According to the Information Bottleneck principle, this information loss limits model performance.
2. **Fixed-Pattern Interaction Problem**: Interactions in dual-stream networks follow a fixed pattern across layers, which also limits overall performance. In complex real-world scenes in particular, a fixed pattern adapts poorly to varying reflection parameters.

To solve these problems, the authors propose a new architecture named **Reversible Decoupling Network (RDNet)**. The main innovations of RDNet include:

- **Multi-Column Reversible Encoder (MCRE)**: A multi-column design based on a part-whole hierarchical structure ensures that information is transferred and retained across scales. MCRE uses reversible units to avoid information loss and achieves cross-scale feature fusion through a two-way interaction mechanism.
- **Transmission-rate-aware Prompt Generator (TAPG)**: To cope with varying reflection intensity in the real world, TAPG learns a channel-level transmittance-reflectance ratio prior from data and uses this prior to guide the decomposition network toward a more accurate transmittance-reflectance ratio at test time, thereby improving the generalization ability of the model.

### Formula Explanation

Some of the key formulas involved in the paper are as follows:

- The basic model of reflection removal:
  \[ I = T + R \]
  where \(I\) is the input image, \(T\) is the transmission layer, and \(R\) is the reflection layer.
- The forward and reverse processes of the reversible unit:
  \[
  \begin{aligned}
  &\text{Forward process:} \\
  &\hat{T}_2 := \hat{T}_1 + F(\hat{R}_1) \\
  &\hat{R}_2 := \hat{R}_1 + G(\hat{T}_2) \\
  &\text{Reverse process:} \\
  &\hat{R}_1 := \hat{R}_2 - G(\hat{T}_2) \\
  &\hat{T}_1 := \hat{T}_2 - F(\hat{R}_1)
  \end{aligned}
  \]
  where \(F(\cdot)\) and \(G(\cdot)\) are arbitrary network modules, and the subscripts index successive estimates of each layer. The reverse process first recovers \(\hat{R}_1\) from the outputs alone and then uses it to recover \(\hat{T}_1\), so no information is lost.
- Content loss function:
  \[ L_{\text{cont}} := c_0 \| \hat{T} - T \|_2^2 + c_1 \| \hat{R} - R \|_2^2 + c_2 \| \nabla \hat{T} - \nabla T \|_1 \]
- Perceptual loss function:
  \[ L_{\text{per}} := \sum_j \omega_j \| \phi_j(\hat{T}) - \phi_j(T) \|_1 \]
- Total loss function:
  \[ L := L_{\text{cont}} + w L_{\text{per}} \]

With these designs, RDNet outperforms existing state-of-the-art methods on five widely used benchmark datasets for single-image reflection removal.
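The invertibility of the reversible unit can be checked numerically. The following is a minimal sketch in plain Python, not the authors' implementation: `f_module` and `g_module` are hypothetical stand-ins for \(F\) and \(G\) (in RDNet these would be learned network modules), and the feature vectors are made-up toy values. It shows that the inputs are recovered from the outputs even though neither stand-in function is itself invertible.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]
Module = Callable[[Vector], Vector]


def f_module(x: Vector) -> Vector:
    """Hypothetical stand-in for F; any deterministic function works."""
    return [math.tanh(2.0 * v) for v in x]


def g_module(x: Vector) -> Vector:
    """Hypothetical stand-in for G; note that v*v is not invertible."""
    return [0.5 * v * v for v in x]


def add(a: Vector, b: Vector) -> Vector:
    return [p + q for p, q in zip(a, b)]


def sub(a: Vector, b: Vector) -> Vector:
    return [p - q for p, q in zip(a, b)]


def forward(t1: Vector, r1: Vector,
            f: Module = f_module, g: Module = g_module) -> Tuple[Vector, Vector]:
    """Forward process: T2 = T1 + F(R1), then R2 = R1 + G(T2)."""
    t2 = add(t1, f(r1))
    r2 = add(r1, g(t2))
    return t2, r2


def reverse(t2: Vector, r2: Vector,
            f: Module = f_module, g: Module = g_module) -> Tuple[Vector, Vector]:
    """Reverse process: R1 = R2 - G(T2) first, then T1 = T2 - F(R1)."""
    r1 = sub(r2, g(t2))  # needs only the outputs
    t1 = sub(t2, f(r1))  # needs the R1 recovered on the previous line
    return t1, r1


# Toy feature vectors (illustrative values only).
t1 = [0.2, -1.0, 0.7]
r1 = [0.1, 0.4, -0.3]

t2, r2 = forward(t1, r1)
t1_rec, r1_rec = reverse(t2, r2)

# Largest absolute difference between original and recovered inputs.
max_err = max(abs(a - b) for a, b in zip(t1 + r1, t1_rec + r1_rec))
print(f"max reconstruction error: {max_err:.2e}")
```

The order inside `reverse` matters: \(\hat{R}_1\) has to be recovered before \(\hat{T}_1\), because \(F\) is evaluated on \(\hat{R}_1\). Any remaining error comes from floating-point rounding only.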
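As a worked example of the content loss, the sketch below evaluates \(L_{\text{cont}}\) on tiny 2×2 single-channel images. Several things here are assumptions rather than details from the paper: the weights `c0`, `c1`, `c2` default to 1.0 purely for illustration, the gradient \(\nabla\) is approximated with forward differences, and the norms are summed rather than averaged. The perceptual term \(L_{\text{per}}\) is left out because \(\phi_j\) requires a feature extractor that cannot be reproduced in a few lines.

```python
from typing import List

Image = List[List[float]]  # single-channel image as rows of pixel values


def squared_l2(a: Image, b: Image) -> float:
    """Squared L2 distance: sum of squared pixel differences."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def gradient_l1(a: Image, b: Image) -> float:
    """L1 distance between forward-difference gradients of a and b.

    Differencing is linear, so grad(a) - grad(b) equals grad(a - b).
    """
    d = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    horizontal = sum(abs(row[j + 1] - row[j])
                     for row in d for j in range(len(row) - 1))
    vertical = sum(abs(d[i + 1][j] - d[i][j])
                   for i in range(len(d) - 1) for j in range(len(d[0])))
    return horizontal + vertical


def content_loss(t_hat: Image, t: Image, r_hat: Image, r: Image,
                 c0: float = 1.0, c1: float = 1.0, c2: float = 1.0) -> float:
    """L_cont = c0*||T^ - T||_2^2 + c1*||R^ - R||_2^2 + c2*||grad T^ - grad T||_1.

    The default weights are hypothetical placeholders, not the paper's values.
    """
    return (c0 * squared_l2(t_hat, t)
            + c1 * squared_l2(r_hat, r)
            + c2 * gradient_l1(t_hat, t))


# Toy ground truth and estimates (made-up values).
t = [[0.0, 1.0], [1.0, 0.0]]
t_hat = [[0.0, 1.0], [1.0, 1.0]]   # one wrong pixel in the transmission layer
r = [[0.0, 0.0], [0.0, 0.0]]
r_hat = [[0.5, 0.0], [0.0, 0.0]]   # small residual in the reflection layer

loss = content_loss(t_hat, t, r_hat, r)
print(loss)  # 1.0 + 0.25 + 2.0 = 3.25
```

The single wrong transmission pixel contributes 1.0 to the squared-error term and 2.0 to the gradient term (one horizontal and one vertical difference), which illustrates how the gradient term penalizes edge errors separately from intensity errors.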