Functional yeast promoter sequence design using temporal convolutional generative language models

Ibrahim Alsaggaf, Cen Wan
DOI: https://doi.org/10.1101/2024.10.22.619701
2024-10-25
Abstract: Promoter sequence design is key to accurately controlling gene expression processes, which play a crucial role in biological systems. Thanks to a recent community effort, the associations between yeast promoter sequences and their corresponding expression levels can now be elucidated using advanced deep learning methods. This milestone advances many downstream biological sequence research tasks, such as synthetic DNA design. In this work, we propose a novel synthetic promoter sequence design method, Gen-DNA-TCN, which exploits a pre-trained sequence-to-expression predictive model to facilitate the training of its generative model, which is based on temporal convolutional neural networks. A large-scale evaluation suggests that Gen-DNA-TCN generates diverse synthetic promoter sequences that also encode distributions of transcription factor binding sites similar to those of real promoter sequences.
Bioinformatics