EvoAug-TF: Extending evolution-inspired data augmentations for genomic deep learning to TensorFlow

Yiyang Yu, Shivani Muthukumar, Peter K Koo
DOI: https://doi.org/10.1101/2024.01.17.575961
2024-01-18
Abstract: Deep neural networks (DNNs) have been widely applied to predict the molecular functions of regulatory regions in the non-coding genome. DNNs are data hungry and thus require many training examples to fit data well. However, functional genomics experiments typically generate limited amounts of data, constrained by the activity levels of the molecular function under study inside the cell. Recently, EvoAug was introduced to train genomic DNNs with evolution-inspired augmentations. EvoAug-trained DNNs have demonstrated improved generalization and improved interpretability under attribution analysis. However, EvoAug supports only PyTorch-based models, which prevents its application to the broad class of genomic DNNs implemented in TensorFlow. Here, we extend EvoAug’s functionality to TensorFlow in a new package we call EvoAug-TF. Through a systematic benchmark, we find that EvoAug-TF yields performance comparable to that of the original EvoAug package.
Bioinformatics