Abstract: The state-of-the-art methods for story visualization demonstrate a significant demand for training data and storage, as well as limited flexibility in story presentation, thereby rendering them impractical for real-world applications. We introduce CogCartoon, a practical story visualization method based on pre-trained diffusion models. To alleviate dependence on data and storage, we propose an innovative strategy of character-plugin generation that can represent a specific character as a compact 316 KB plugin by using a few training samples. To facilitate enhanced flexibility, we employ a strategy of plugin-guided and layout-guided inference, enabling users to seamlessly incorporate new characters and custom layouts into the generated image results at their convenience. We have conducted comprehensive qualitative and quantitative studies, providing compelling evidence for the superiority of CogCartoon over existing methodologies. Moreover, CogCartoon demonstrates its power in tackling challenging tasks, including long story visualization and realistic style story visualization.

What problem does this paper attempt to address?

The paper targets two main drawbacks of existing story visualization methods:

1. **Data and Storage Dependence**: Existing story visualization methods rely heavily on large amounts of training data and storage. For example, the commonly used FlintstonesSV and PororoSV datasets contain 20,132 and 10,191 training samples, respectively, yet collecting tens of thousands of samples is unrealistic in the early stage of storybook creation. These methods also need to store a separate model for each independent story, which is impractical in large-scale commercial scenarios, where there are usually many independent stories.
2. **Lack of Flexibility**: Existing methods offer limited flexibility in integrating new characters and controlling the layout. In practice, users often need to insert new characters and adjust the layout at any time. Existing methods struggle to meet these requirements because they rely on character-specific datasets for fine-tuning and lack layout control strategies.

To address these problems, the paper proposes **CogCartoon**, a practical story visualization framework. CogCartoon overcomes the limitations of existing methods through two strategies:

- **Character Plugin Generation**: Using a small number of training samples, a specific character can be represented as a compact plugin of only 316 KB. Storing multiple independent stories then requires only the character plugins and one shared diffusion model, which reduces the dependence on data and storage.
- **Plugin-Guided and Layout-Guided Inference**: Users can add new characters and modify character positions as needed. When introducing a new character, the user creates the corresponding plugin from a small number of samples. The proposed inference method then combines the newly created plugin, the existing character plugins, and a custom layout to generate a story illustration that contains the new character.

With these strategies, CogCartoon is more efficient in data and storage and more flexible than existing methods, which makes it better suited to story visualization in practical applications.
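The storage argument can be made concrete with some back-of-the-envelope arithmetic. The sketch below is only an illustration: the 316 KB plugin size comes from the paper, while the diffusion model size, the number of stories, and the number of characters per story are hypothetical values chosen for the example, and the function names are not from the paper.

```python
PLUGIN_KB = 316  # per-character plugin size reported in the paper


def per_story_models_kb(num_stories: int, model_kb: int) -> int:
    """Storage when every independent story keeps its own fine-tuned model."""
    return num_stories * model_kb


def shared_model_with_plugins_kb(
    num_characters: int, model_kb: int, plugin_kb: int = PLUGIN_KB
) -> int:
    """Storage for one shared diffusion model plus one plugin per character."""
    return model_kb + num_characters * plugin_kb


if __name__ == "__main__":
    # ASSUMPTIONS (not from the paper): a 4 GiB model, 100 stories,
    # and 5 distinct characters per story.
    model_kb = 4 * 1024 * 1024
    num_stories = 100
    num_characters = num_stories * 5

    separate = per_story_models_kb(num_stories, model_kb)
    shared = shared_model_with_plugins_kb(num_characters, model_kb)

    print(f"one model per story:    {separate / 1024 / 1024:.1f} GiB")
    print(f"shared model + plugins: {shared / 1024 / 1024:.2f} GiB")
    print(f"ratio: {separate / shared:.1f}x")
```

Under these assumed numbers, the per-story cost grows with the full model size, whereas the plugin approach grows by only 316 KB per added character, so the gap widens as the number of stories increases.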

CogCartoon: Towards Practical Story Visualization

TaleCrafter: Interactive Story Visualization with Multiple Characters

AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort

Real-time Cartoon Water Animation

Story-Adapter: A Training-free Iterative Framework for Long Story Visualization

StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion

A Customizable Generator for Comic-Style Visual Narrative

Novel Approaches To Computer-Assisted Cartoon Animation

Make-A-Storyboard: A General Framework for Storyboard with Disentangled and Merged Control

Interactive Cartoon Reusing by Transfer Learning

Storytelling from an Image Stream Using Scene Graphs

Evolving Storytelling: Benchmarks and Methods for New Character Customization with Diffusion Models

Cartoon Character Animation from Multi-view Hand-Drawings

Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences

Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models

iStoryline: Effective Convergence to Hand-drawn Storylines

Content-Aware Video2Comics with Manga-Style Layout

Graphic Narrative with Interactive Stylization Design

Movie2Comics: Towards a Lively Video Content Presentation

CPST: Comprehension-Preserving Style Transfer for Multi-Modal Narratives

An Interactive Web-Based System for Creating Single Panel Cartoons with Visually Valid Compositions