Diffusion Model-Based Video Editing: A Survey

Wenhao Sun, Rong-Cheng Tu, Jingyi Liao, Dacheng Tao
2024-06-26
Abstract: The rapid development of diffusion models (DMs) has significantly advanced image and video applications, making "what you want is what you see" a reality. Among these, video editing has gained substantial attention and seen a swift rise in research activity, necessitating a comprehensive and systematic review of the existing literature. This paper reviews diffusion model-based video editing techniques, including theoretical foundations and practical applications. We begin by overviewing the mathematical formulation and the key methods of the image domain. Subsequently, we categorize video editing approaches by the inherent connections of their core technologies, depicting their evolutionary trajectory. This paper also dives into novel applications, including point-based editing and pose-guided human video editing. Additionally, we present a comprehensive comparison using our newly introduced V2VBench. Building on the progress achieved to date, the paper concludes with ongoing challenges and potential directions for future research.
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning, Multimedia
What problem does this paper attempt to address?
This paper addresses the lack of a comprehensive, systematic review of diffusion model-based video editing, covering both its theoretical basis and its practical applications. The rapid development of diffusion models (DMs) has significantly advanced image and video applications, making "what you want is what you see" a reality. Video editing in particular has attracted wide attention and a rapid increase in research activity, so a detailed review and analysis of the existing literature is needed. The paper focuses on the following aspects:

1. **Theoretical Basis**: Outline the mathematical formulation of diffusion models and the key methods in the image domain.
2. **Technical Classification**: Categorize video editing methods by the inherent connections of their core technologies and depict their evolutionary trajectory.
3. **New Applications**: Explore novel applications such as point-based editing and pose-guided human video editing.
4. **Benchmark Testing**: Introduce a new benchmark, V2VBench, and use it to compare video editing methods comprehensively.
5. **Future Directions**: Summarize ongoing challenges and propose potential directions for future research.

Together, these contributions consolidate the current understanding of diffusion models in video editing and give future research a reference point.