Combinatorial Complex Score-based Diffusion Modelling through Stochastic Differential Equations

Adrien Carrel
2024-06-07
Abstract: Graph structures offer a versatile framework for representing diverse patterns in nature and complex systems, applicable across domains like molecular chemistry, social networks, and transportation systems. While diffusion models have excelled in generating various objects, generating graphs remains challenging. This thesis explores the potential of score-based generative models for generating such objects by modelling them as combinatorial complexes, which are powerful topological structures that encompass higher-order relationships. We propose a unified framework built on stochastic differential equations. We not only generalize the generation of complex objects such as graphs and hypergraphs, but also unify existing generative modelling approaches such as Score Matching with Langevin Dynamics and Denoising Diffusion Probabilistic Models. This overcomes limitations of existing frameworks that focus solely on graph generation, opening up new possibilities in generative AI. The experimental results show that our framework can generate these complex objects and can also compete with state-of-the-art approaches on plain graph and molecule generation tasks.
Machine Learning, Social and Information Networks, Algebraic Topology
What problem does this paper attempt to address?
The main problem this paper addresses is the limitation of existing graph generation methods when dealing with complex higher-order relationships. Although diffusion models have achieved remarkable results in generating various objects, they still struggle with graph structures. The difficulty stems mainly from the complexity of graphs, their varying sizes, and the higher-order relationships they may contain. By introducing a score-based diffusion framework built on combinatorial complexes (CCs), the paper aims to overcome these obstacles and broaden the scope of generative AI models. The framework can generate not only graphs but also richer higher-dimensional topological objects, such as hypergraphs and simplicial complexes, which gives more natural and general solutions for tasks such as molecule generation. The core contributions of the paper include:
1. **Introduction of the CCSD framework**: A new score-based diffusion model (CCSD) is proposed, which uses stochastic differential equations (SDEs) to generate combinatorial complexes, going beyond traditional graph generation methods.
2. **Innovation of mathematical objects**: New mathematical objects are introduced, bringing combinatorial complexes into the broader context of generative AI.
3. **Design of neural network architectures**: Operators are designed and redefined to support neural network architectures that process higher-order topological structures.
4. **Learning of partial score functions**: New layers and neural network architectures are proposed for learning partial score functions.
5. **Object transformation**: A procedure for transforming low-dimensional graphs (such as molecules) into combinatorial complexes is developed, including an improved path-lifting method.
6. **New evaluation metrics**: New metrics are proposed for evaluating the quality of the generated combinatorial complexes and their agreement with the original object distribution.
7. **Development of a Python library**: A Python library named CCSD is developed, providing tools for model training and sampling.
Through these contributions, the paper aims to advance generative models so that they can produce rich topological structures, with potential impact in fields such as drug discovery.
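To make the SDE viewpoint from the abstract more concrete, the sketch below shows how the two model families it names are commonly written as continuous-time noising processes in the score-based SDE literature: the variance-preserving (VP) SDE corresponds to DDPM and the variance-exploding (VE) SDE to Score Matching with Langevin Dynamics. This is an illustrative sketch only, not code from the CCSD library. The function names, the linear `beta` schedule, the geometric `sigma` schedule, and all default values are assumptions, and only a plain adjacency matrix is perturbed (no higher-order cells).

```python
import numpy as np

def vp_marginal(x0, t, beta_min=0.1, beta_max=20.0):
    """Mean and std of x_t given x_0 under the VP SDE (continuous-time DDPM).

    Assumes a linear schedule beta(t) = beta_min + t * (beta_max - beta_min).
    """
    # log of exp(-0.5 * integral_0^t beta(s) ds)
    log_coeff = -0.25 * t**2 * (beta_max - beta_min) - 0.5 * t * beta_min
    mean = np.exp(log_coeff) * x0
    std = np.sqrt(1.0 - np.exp(2.0 * log_coeff))
    return mean, std

def ve_marginal(x0, t, sigma_min=0.01, sigma_max=50.0):
    """Mean and std of x_t given x_0 under the VE SDE (continuous-time SMLD).

    Assumes a geometric schedule sigma(t) = sigma_min * (sigma_max / sigma_min)**t.
    """
    sigma_t = sigma_min * (sigma_max / sigma_min) ** t
    std = np.sqrt(sigma_t**2 - sigma_min**2)
    return x0, std

def perturb_adjacency(adj, t, rng, marginal=vp_marginal):
    """Sample a noisy version of a symmetric, zero-diagonal adjacency matrix."""
    mean, std = marginal(adj, t)
    # Draw noise on the strict upper triangle and mirror it,
    # so the perturbed matrix stays symmetric with a zero diagonal.
    z = np.triu(rng.standard_normal(adj.shape), k=1)
    z = z + z.T
    return mean + std * z

# Toy input: the adjacency matrix of a triangle graph.
adj = np.array([[0.0, 1.0, 1.0],
                [1.0, 0.0, 1.0],
                [1.0, 1.0, 0.0]])
rng = np.random.default_rng(0)
noisy_vp = perturb_adjacency(adj, 0.5, rng, marginal=vp_marginal)
noisy_ve = perturb_adjacency(adj, 0.5, rng, marginal=ve_marginal)
```

In a score-based model, a network would then be trained to estimate the score of these perturbed distributions, and sampling would integrate the corresponding reverse-time SDE. Both steps are omitted here.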