Multi‐style cartoonization: Leveraging multiple datasets with generative adversarial networks

Jianlu Cai, Frederick W. B. Li, Fangzhe Nan, Bailin Yang
DOI: https://doi.org/10.1002/cav.2269
IF: 1.01
2024-05-19
Computer Animation and Virtual Worlds
Abstract: We introduce a multi‐style scene cartoonization GAN that improves photo‐to‐cartoon conversion. By combining multiple cartoon datasets with a new style encoding, our model produces cartoon effects that are both more realistic and more abstract than those of previous approaches. By capturing relationships between datasets, it provides high‐quality cartoon images without tedious iterative retraining. Scene cartoonization aims to convert photos into stylized cartoons. While generative adversarial networks (GANs) can generate high‐quality images, previous methods focus on individual images or single styles and ignore relationships between datasets. We propose a novel multi‐style scene cartoonization GAN that leverages multiple cartoon datasets jointly. Our main technical contribution is a multi‐branch style encoder that disentangles representations to model styles as distributions over entire datasets rather than over individual images. Combined with a multi‐task discriminator and perceptual losses optimized across collections, our model achieves state‐of‐the‐art diverse stylization while preserving semantics. Experiments demonstrate that, by learning from inter‐dataset relationships, our method translates photos into cartoon images with improved realism and abstraction fidelity compared to prior work, without iterative re‐training for new styles.
computer science, software engineering