Renormalization Group flow, Optimal Transport and Diffusion-based Generative Model

Artan Sheshmani, Yi-Zhuang You, Baturalp Buyukates, Amir Ziashahabi, Salman Avestimehr
2024-03-01
Abstract: Diffusion-based generative models represent a forefront direction in generative AI research today. Recent studies in physics have suggested that the renormalization group (RG) can be conceptualized as a diffusion process. This insight motivates us to develop a novel diffusion-based generative model by reversing the momentum-space RG flow. We establish a framework that interprets RG flow as optimal transport gradient flow, which minimizes a functional analogous to the Kullback-Leibler divergence, thereby bridging statistical physics and information theory. Our model applies forward and reverse diffusion processes in Fourier space, exploiting the sparse representation of natural images in this domain to efficiently separate signal from noise and manage image features across scales. By introducing a scale-dependent noise schedule informed by a dispersion relation, the model optimizes denoising performance and image generation in Fourier space, taking advantage of the distinct separation of macro and microscale features. Experimental validations on standard datasets demonstrate the model's capability to generate high-quality images while significantly reducing training time compared to existing image-domain diffusion models. This approach not only enhances our understanding of the generative processes in images but also opens new pathways for research in generative AI, leveraging the convergence of theoretical physics, optimal transport, and machine learning principles.
Disordered Systems and Neural Networks
What problem does this paper attempt to address?
This paper proposes an approach that combines renormalization group (RG) flow with optimal transport and diffusion-based generative models. Motivated by the view in physics that the RG can be understood as a diffusion process, the authors build a diffusion-based generative model, called Reverse Renormalization Group Flow, by reversing the momentum-space RG flow. They interpret RG flow as an optimal transport gradient flow that minimizes a functional analogous to the Kullback-Leibler divergence, which links statistical physics to information theory. The model runs the forward and reverse diffusion processes in Fourier space, where natural images have a sparse representation; this helps separate signal from noise and handle image features at different scales. A scale-dependent noise schedule based on a dispersion relation is introduced to improve denoising performance and image generation. Experiments on standard datasets show that the model generates high-quality images while significantly reducing training time compared with existing image-domain diffusion models. The paper also reviews the fundamental concepts of optimal transport theory, such as the Monge and Kantorovich formulations, and their applications in image processing, machine learning, and natural language processing. The paper proposes an approach called the Frequency Domain Diffusion Model (FDDM), which uses diffusion processes in Fourier space to improve computational efficiency while preserving image quality. In summary, the paper aims to develop more efficient, high-quality image generation models by integrating principles from mathematics, physics, and machine learning, to deepen the understanding of the image generation process, and to open new avenues for research in generative AI.
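
To make the idea of a forward diffusion process in Fourier space with a scale-dependent noise schedule more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a hypothetical dispersion relation omega(k) = |k|^2 + mass^2 and an Ornstein-Uhlenbeck-style update applied independently to each Fourier mode, so that fine-scale (high-frequency) modes are replaced by noise sooner than coarse-scale ones. The function names `dispersion` and `forward_diffuse` and the `mass` parameter are illustrative choices, not taken from the paper.

```python
import numpy as np


def dispersion(shape, mass=0.1):
    """Assumed dispersion relation omega(k) = |k|^2 + mass^2 on the FFT grid.

    This particular form is an illustrative assumption, not the paper's.
    """
    ky = 2 * np.pi * np.fft.fftfreq(shape[0])  # angular frequencies along rows
    kx = 2 * np.pi * np.fft.fftfreq(shape[1])  # angular frequencies along columns
    k_squared = ky[:, None] ** 2 + kx[None, :] ** 2
    return k_squared + mass ** 2


def forward_diffuse(image, t, mass=0.1, rng=None):
    """Noise a real 2D image in Fourier space up to diffusion time t.

    Each mode k is scaled by exp(-omega(k) * t) and receives Gaussian noise
    with variance 1 - exp(-2 * omega(k) * t), so larger omega(k) means the
    mode loses its signal faster.
    """
    rng = np.random.default_rng() if rng is None else rng
    omega = dispersion(image.shape, mass)
    decay = np.exp(-omega * t)          # per-mode signal attenuation
    sigma = np.sqrt(1.0 - decay ** 2)   # per-mode noise amplitude

    # The orthonormal FFT keeps white noise at unit variance per mode.
    x_k = np.fft.fft2(image, norm="ortho")
    noise_k = np.fft.fft2(rng.standard_normal(image.shape), norm="ortho")

    x_k_t = decay * x_k + sigma * noise_k
    # omega is symmetric under k -> -k, so the result is real up to rounding.
    return np.fft.ifft2(x_k_t, norm="ortho").real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.standard_normal((32, 32))  # stand-in for a normalized image
    noisy = forward_diffuse(image, t=0.5, rng=rng)
    print(noisy.shape)
```

In this sketch the zero-frequency mode has the smallest omega and therefore keeps its signal longest, while the highest-frequency mode is noised first. A generative model built on such a process would learn to reverse it, restoring coarse structure before fine detail.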