DNA-Diffusion: Leveraging Generative Models for Controlling Chromatin Accessibility and Gene Expression via Synthetic Regulatory Elements
Lucas Ferreira DaSilva, Simon Senan, Zain Munir Patel, Aniketh Janardhan Reddy, Sameer Gabbita, Zach Nussbaum, César Miguel Valdez Córdova, Aaron Wenteler, Noah Weber, Tin M. Tunjic, Talha Ahmad Khan, Zelun Li, Cameron Smith, Matei Bejan, Lithin Karmel Louis, Paola Cornejo, Will Connell, Emily S. Wong, Wouter Meuleman, Luca Pinello
DOI: https://doi.org/10.1101/2024.02.01.578352
2024-02-01
Abstract: Systematically modifying and optimizing regulatory elements for precise control of gene expression is a central challenge in modern genomics and synthetic biology. Advances in generative AI have paved the way for designing synthetic sequences that aim to modulate gene expression safely and accurately. We leverage diffusion models to design context-specific DNA regulatory sequences, which hold significant potential for enabling novel therapeutic applications that require precise modulation of gene expression. Our framework uses a cell type-specific diffusion model to generate synthetic 200 bp regulatory elements based on chromatin accessibility across different cell types. We evaluate the generated sequences on key metrics to ensure they retain properties of endogenous sequences: transcription factor binding site composition, potential for cell type-specific chromatin accessibility, and capacity to activate gene expression in different cell contexts, as assessed with state-of-the-art prediction models. Our results demonstrate the ability to robustly generate DNA sequences with cell type-specific regulatory potential. DNA-Diffusion paves the way for a regulatory modulation approach to mammalian synthetic biology and precision gene therapy.
Synthetic Biology
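
As a rough illustration of the setup the abstract describes, the sketch below one-hot encodes a 200 bp DNA sequence and applies a Gaussian forward-noising step. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the diffusion formulation, so the DDPM-style forward process, the linear beta schedule, the step count, and all function names (`one_hot`, `alpha_bar`, `forward_noise`, `decode`) are hypothetical choices made for illustration. Cell type conditioning and the learned denoising network are omitted.

```python
import math
import random

ALPHABET = "ACGT"
SEQ_LEN = 200  # element length stated in the abstract


def one_hot(seq):
    """Encode a DNA string as SEQ_LEN rows of 4 channels (A, C, G, T).

    Bases outside ACGT (e.g. N) become all-zero rows.
    """
    if len(seq) != SEQ_LEN:
        raise ValueError(f"expected {SEQ_LEN} bp, got {len(seq)}")
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in seq.upper()]


def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta) over the first t steps.

    Assumes a linear beta schedule (hypothetical; not taken from the paper).
    """
    prod = 1.0
    for step in range(t):
        beta = beta_start + (beta_end - beta_start) * step / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def forward_noise(x0, t, num_steps, rng):
    """Sample x_t = sqrt(a) * x0 + sqrt(1 - a) * eps, with a = alpha_bar(t)."""
    a = alpha_bar(t, num_steps)
    return [[math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
             for v in row]
            for row in x0]


def decode(x):
    """Map each position back to the base with the largest channel value."""
    return "".join(ALPHABET[max(range(4), key=lambda i: row[i])] for row in x)


if __name__ == "__main__":
    rng = random.Random(0)
    seq = "".join(rng.choice(ALPHABET) for _ in range(SEQ_LEN))
    x0 = one_hot(seq)
    x_noisy = forward_noise(x0, t=50, num_steps=50, rng=rng)
    matches = sum(a == b for a, b in zip(seq, decode(x_noisy)))
    print(f"bases recovered by argmax after noising: {matches}/{SEQ_LEN}")
```

In a full pipeline of this kind, a trained network would learn to reverse the noising step, conditioned on a cell type label, and `decode` would turn the denoised array back into a candidate sequence for downstream evaluation.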