Image‐level Dataset Synthesis with an End‐to‐end Trainable Framework

Zhenfeng Xue, Weijie Mao, Yong Liu
DOI: https://doi.org/10.1049/ipr2.12486
IF: 2.3
2022-01-01
IET Image Processing
Abstract: Dataset synthesis via virtual engines such as Unity has attracted increasing attention in recent years because of the low cost of obtaining ground-truth labels. For this kind of work, virtual environments are constructed within the engine to mimic the real world, either through considerable manual effort or through learning-based methods. The latter are superior when the target real-world scenes change, because the attributes of the environments can be adjusted automatically based on the distribution difference between the synthetic and real-world datasets. However, the non-differentiability of the whole pipeline hinders the efficiency of attribute optimization. To this end, this paper proposes to simulate synthetic datasets from a fine-grained perspective, so that the system can be trained in an end-to-end manner. Specifically, the task is converted into an image-level data synthesis problem, and a constraint is designed using the content loss between two images. Because the rendering process of the virtual engine is mathematically unknown, which blocks the back-propagation of gradients, a generative model is trained to approximate the engine. As a result, the whole framework becomes fully differentiable and the attributes can be optimized efficiently by gradient descent. Experimental results show the efficiency of the method in obtaining useful synthetic training datasets. In addition, the image-level method is found to make it possible to learn the underlying distribution of real-world data, which is hard to achieve with existing methods. To the best of our knowledge, this is the first attempt to accomplish this task with a differentiable process.
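The core idea in the abstract (replace a non-differentiable renderer with a learned differentiable approximation, then optimize scene attributes by gradient descent against a target image) can be illustrated with a toy sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: `engine_render` is a hypothetical stand-in for the virtual engine, a least-squares linear model stands in for the generative model, and a pixel-wise mean squared error stands in for the content loss. The toy engine is linear on purpose, so the linear surrogate can fit it exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box "engine": maps 3 scene attributes to an 8x8 image.
# It is treated as opaque; only forward evaluations are used below.
_W = rng.normal(size=(64, 3))
_b = rng.normal(size=64)

def engine_render(attrs):
    return (_W @ attrs + _b).reshape(8, 8)

# Step 1: fit a differentiable surrogate on sampled (attributes, image) pairs.
# Assumption: a linear least-squares model replaces the paper's generative model.
A = rng.uniform(-1.0, 1.0, size=(200, 3))
Y = np.stack([engine_render(a).ravel() for a in A])
X = np.hstack([A, np.ones((200, 1))])            # append a bias column
theta, *_ = np.linalg.lstsq(X, Y, rcond=None)    # shape (4, 64)
W_hat, b_hat = theta[:3].T, theta[3]

def surrogate(attrs):
    return W_hat @ attrs + b_hat

# Step 2: optimize attributes by gradient descent through the surrogate.
# Assumption: pixel-wise MSE replaces the paper's content loss.
true_attrs = np.array([0.5, -0.3, 0.8])
target = engine_render(true_attrs).ravel()       # plays the role of a real image

attrs = np.zeros(3)
lr = 0.2
for _ in range(300):
    residual = surrogate(attrs) - target
    grad = 2.0 * W_hat.T @ residual / residual.size   # d(MSE)/d(attrs)
    attrs -= lr * grad

print("recovered attributes:", attrs)
```

In this sketch the gradient is written by hand because the surrogate is linear; with a neural generative model, an automatic differentiation framework would supply the same gradient with respect to the attributes.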