Cycle contrastive adversarial learning with structural consistency for unsupervised high-quality image deraining transformer

Chen Zhao, Weiling Cai, Chengwei Hu, Zheng Yuan
DOI: https://doi.org/10.1016/j.neunet.2024.106428
Abstract: Recent unsupervised single image deraining (SID) methods avoid the need for paired real-world data and achieve acceptable deraining performance. However, they often fail to produce high-quality rain-free images because they pay insufficient attention to semantic representation and image content, and therefore cannot completely separate the content from the rain layer. In this paper, we develop a novel cycle contrastive adversarial framework for unsupervised SID, which mainly consists of cycle contrastive learning (CCL) and location contrastive learning (LCL). Specifically, CCL achieves high-quality image reconstruction and rain-layer stripping by pulling similar features together while pushing dissimilar features apart in both the semantic and discriminant latent spaces. Meanwhile, LCL implicitly constrains the mutual information between the same locations of different exemplars to preserve content information. In addition, inspired by the Segment Anything Model (SAM), which effectively extracts widely applicable semantic structural details, we formulate a structural-consistency regularization to fine-tune our network using SAM. We also introduce the vision transformer (ViT) into our network architecture to further improve performance. In the resulting transformer-based GAN, we propose a multi-layer channel compression attention module (MCCAM) to extract richer features and obtain a stronger representation. Equipped with these techniques, our unsupervised SID algorithm, called CCLformer, achieves advantageous image deraining performance. Extensive experiments demonstrate both the superiority of our method and the effectiveness of each module in CCLformer. The code is available at https://github.com/zhihefang/CCLGAN.
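To make the contrastive idea in the abstract concrete, the following is a minimal sketch of an InfoNCE-style loss in which features at the same spatial location of two images form a positive pair and features at other locations act as negatives. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the loss form, and the function names `info_nce` and `location_contrastive_loss`, the temperature value, and the use of NumPy instead of a deep learning framework are all hypothetical choices made here.

```python
import numpy as np


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for a single query vector (hypothetical helper).

    query: (d,), positive: (d,), negatives: (n, d).
    The loss is small when the query is closer to the positive
    than to any negative, measured by cosine similarity.
    """
    def normalize(x):
        return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)

    q, pos, neg = normalize(query), normalize(positive), normalize(negatives)
    # Index 0 holds the positive logit; the rest are negative logits.
    logits = np.concatenate([[q @ pos], neg @ q]) / temperature
    logits = logits - logits.max()  # subtract max for numerical stability
    log_prob = logits[0] - np.log(np.exp(logits).sum())
    return -log_prob


def location_contrastive_loss(feat_a, feat_b, temperature=0.07):
    """Average InfoNCE over locations (assumed formulation).

    feat_a, feat_b: (num_locations, d) feature maps flattened over space.
    Row i of feat_a is pulled toward row i of feat_b and pushed away
    from every other row of feat_b.
    """
    losses = []
    for i in range(feat_a.shape[0]):
        negatives = np.delete(feat_b, i, axis=0)
        losses.append(info_nce(feat_a[i], feat_b[i], negatives, temperature))
    return float(np.mean(losses))


rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 16))
# Features that agree location by location give a low loss.
aligned = location_contrastive_loss(feat, feat + 0.01 * rng.normal(size=feat.shape))
# Shifting locations by one breaks the correspondence and raises the loss.
shuffled = location_contrastive_loss(feat, np.roll(feat, 1, axis=0))
```

In this sketch, `aligned` comes out lower than `shuffled`, which is the behavior a location-wise contrastive constraint is meant to reward: content at corresponding positions should stay similar between the rainy input and the derained output.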
What problem does this paper attempt to address?