Progressive Random Convolutions for Single Domain Generalization

Seokeon Choi, Debasmit Das, Sungha Choi, Seunghan Yang, Hyunsin Park, Sungrack Yun
2023-04-02
Abstract: Single domain generalization aims to train a generalizable model with only one source domain to perform well on arbitrary unseen target domains. Image augmentation based on Random Convolutions (RandConv), consisting of one convolution layer randomly initialized for each mini-batch, enables the model to learn generalizable visual representations by distorting local textures despite its simple and lightweight structure. However, RandConv has structural limitations in that the generated image easily loses semantics as the kernel size increases, and lacks the inherent diversity of a single convolution operation. To solve the problem, we propose a Progressive Random Convolution (Pro-RandConv) method that recursively stacks random convolution layers with a small kernel size instead of increasing the kernel size. This progressive approach can not only mitigate semantic distortions by reducing the influence of pixels away from the center in the theoretical receptive field, but also create more effective virtual domains by gradually increasing the style diversity. In addition, we develop a basic random convolution layer into a random convolution block including deformable offsets and affine transformation to support texture and contrast diversification, both of which are also randomly initialized. Without complex generators or adversarial learning, we demonstrate that our simple yet effective augmentation strategy outperforms state-of-the-art methods on single domain generalization benchmarks.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses single domain generalization: training a model on data from only one source domain so that it performs well on unseen target domains. It proposes Progressive Random Convolutions (Pro-RandConv), which improves on the existing random convolution augmentation (RandConv). Specifically:

1. **Addressing the limitations of RandConv**: RandConv has two main limitations. As the kernel size increases, the generated image tends to lose semantic information, and a single convolution operation offers limited diversity. Pro-RandConv instead recursively stacks random convolution layers with a small kernel size, which gradually increases style diversity while preserving the semantic information of objects.
2. **Enhancing diversity and semantic preservation**: To further diversify the image transformations, the authors extend the basic random convolution layer into a random convolution block that includes deformable offsets and an affine transformation, both randomly initialized. These operations diversify texture and contrast, so the generated styles vary more while remaining semantically consistent.
3. **Simple and effective implementation**: Unlike methods based on adversarial learning or complex generators, Pro-RandConv is a simple image augmentation that requires no complex training process or additional loss functions.

Experimental results show that on benchmarks such as Digits and PACS, Pro-RandConv outperforms existing methods in single domain generalization and remains competitive in multi-domain generalization, which supports the effectiveness and practicality of the method.
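The progressive stacking idea can be illustrated with a minimal, standard-library-only sketch on a grayscale image stored as a list of lists. It is not the authors' implementation: the function names (`pro_randconv`, `random_kernel`, and the others), the Gaussian sampling parameters, the per-layer standardization, and the reuse of one set of random weights across repetitions are all assumptions made for illustration. Deformable offsets are omitted entirely. The `receptive_field` helper shows that stacking stride-1 layers with small kernels grows the theoretical receptive field linearly.

```python
import random


def receptive_field(num_layers, kernel_size):
    """Theoretical receptive field of stacked stride-1 convolutions."""
    return num_layers * (kernel_size - 1) + 1


def random_kernel(kernel_size, rng):
    """Sample a kernel from a zero-mean Gaussian (the std is an assumed choice)."""
    std = 1.0 / kernel_size
    return [[rng.gauss(0.0, std) for _ in range(kernel_size)]
            for _ in range(kernel_size)]


def conv2d_same(image, kernel):
    """Zero-padded, stride-1 cross-correlation that keeps the image size."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy, xx = y + i - pad, x + j - pad
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def standardize(image, eps=1e-8):
    """Rescale to zero mean and (near) unit variance; an assumed normalization."""
    flat = [v for row in image for v in row]
    mean = sum(flat) / len(flat)
    var = sum((v - mean) ** 2 for v in flat) / len(flat)
    std = (var + eps) ** 0.5
    return [[(v - mean) / std for v in row] for row in image]


def pro_randconv(image, max_repeats=10, kernel_size=3, rng=None):
    """Apply one small random block a random number of times (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    kernel = random_kernel(kernel_size, rng)  # assumption: weights shared across repeats
    gamma = rng.gauss(1.0, 0.5)               # assumed random contrast scale
    beta = rng.gauss(0.0, 0.5)                # assumed random intensity shift
    repeats = rng.randint(1, max_repeats)     # how many times the block is stacked
    out = image
    for _ in range(repeats):
        out = standardize(conv2d_same(out, kernel))
        out = [[gamma * v + beta for v in row] for row in out]  # affine transformation
    return out, repeats
```

Calling `pro_randconv(img, max_repeats=4, rng=random.Random(0))` returns an augmented image of the same size together with the number of repetitions used. Under these assumptions, three stacked 3x3 layers cover the same 7x7 theoretical receptive field as a single 7x7 kernel, which is the trade-off the abstract describes.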