siRNADesign: A Graph Neural Network for siRNA Efficacy Prediction via Deep RNA Sequence Analysis
Rongzhuo Long, Ziyu Guo, Da Han, Xudong Yuan, Guangyong Chen, Pheng Ann Heng, Liang Zhang
DOI: https://doi.org/10.1101/2024.04.28.591509
2024-05-28
Abstract: The clinical adoption of small interfering RNAs (siRNAs) has prompted the development of various computational strategies for siRNA design, from traditional data analysis to advanced machine learning techniques. However, previous studies have not adequately considered the full complexity of the siRNA silencing mechanism, neglecting critical elements such as siRNA positioning on mRNA, RNA base-pairing probabilities, and RNA-AGO2 interactions, which limits the insight and accuracy of existing models. Here, we introduce siRNADesign, a Graph Neural Network (GNN) framework that leverages both non-empirical and empirical rule-based features of siRNA and mRNA to effectively capture the complex dynamics of gene silencing. On multiple internal datasets, siRNADesign achieves state-of-the-art performance. Notably, siRNADesign also outperforms existing methods in in vitro wet-lab experiments and on an externally validated dataset. Additionally, we develop a new data-splitting methodology that addresses data leakage, a frequently overlooked issue in previous studies, ensuring the robustness and stability of our model under various experimental settings. Through rigorous testing, siRNADesign has demonstrated remarkable predictive accuracy and robustness, making significant contributions to the field of gene silencing. Furthermore, our approach to redefining data-splitting standards aims to set new benchmarks for future research on predictive biological modeling for siRNA.
Bioinformatics