A Survey on Long Video Generation: Challenges, Methods, and Prospects

Chengxuan Li, Di Huang, Zeyu Lu, Yang Xiao, Qingqi Pei, Lei Bai
2024-03-25
Abstract: Video generation is a rapidly advancing research area, garnering significant attention due to its broad range of applications. One critical aspect of this field is the generation of long-duration videos, which presents unique challenges and opportunities. This paper presents the first survey of recent advancements in long video generation and summarises them into two key paradigms: divide and conquer, and temporal autoregressive.
Computer Science
What problem does this paper attempt to address?
The paper aims to address several key issues in the field of long video generation and provide references and guidance for future research. Specifically:

1. **Defining the Standard for Long Videos**: Current research lacks a unified definition of "long video" and a standard measurement scale. The paper proposes a new definition based on the number of frames (more than 100 frames) or duration (more than 10 seconds) to provide a clear benchmark.
2. **Technical Challenges of Long Video Generation**: Long video generation faces challenges such as hardware resource limitations, data scarcity, temporal consistency, and content continuity. The paper summarizes existing methods and techniques, such as the Divide and Conquer and Temporal Autoregressive paradigms, that address these challenges.
3. **Review of Existing Models and Techniques**: The paper provides a detailed introduction to four popular video generation models (Diffusion Models, Autoregressive Models, Generative Adversarial Networks, and Masked Modeling), as well as the application of control signals (text prompts, image prompts, and video prompts) in long video generation.
4. **Future Trends and Opportunities**: The paper discusses current issues in the field of long video generation and looks forward to future research directions, including improving video quality, enhancing the spatiotemporal consistency of models, and increasing the diversity and coherence of long videos.

Through this review paper, the authors hope to provide a comprehensive reference guide for researchers and practitioners, promoting the development of long video generation technology.
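
The proposed long-video criterion (more than 100 frames, or more than 10 seconds) can be written as a small predicate. The sketch below is illustrative only: the function name `is_long_video`, its parameters, and the constant names are assumptions made for this example and do not come from the paper. It treats the two thresholds as alternatives, as the summary states them.

```python
from typing import Optional

# Thresholds as stated in the summary; the constant names are hypothetical.
LONG_VIDEO_MIN_FRAMES = 100
LONG_VIDEO_MIN_SECONDS = 10.0


def is_long_video(num_frames: Optional[int] = None,
                  duration_s: Optional[float] = None) -> bool:
    """Return True if a clip exceeds either long-video threshold.

    Assumption: either criterion alone is sufficient, and both
    comparisons are strict ("more than").
    """
    if num_frames is None and duration_s is None:
        raise ValueError("provide num_frames, duration_s, or both")
    by_frames = num_frames is not None and num_frames > LONG_VIDEO_MIN_FRAMES
    by_duration = duration_s is not None and duration_s > LONG_VIDEO_MIN_SECONDS
    return by_frames or by_duration
```

For instance, under these assumptions a 64-frame, 4-second clip would not count as long, while a 101-frame clip would.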