AttSiOff: A self-attention-based approach on siRNA design with inhibition and off-target effect prediction

Bin Liu, Ye Yuan, Xiaoyong Pan, Hongbin Shen, Cheng Jin
DOI: https://doi.org/10.1101/2023.11.24.568517
2023-01-01
Abstract:
Motivation: Small interfering RNA (siRNA) is often used for functional studies and expression regulation of specific genes, as well as for the development of small molecule drugs. Selecting siRNAs with high inhibition and low off-target effects from a massive pool of candidates remains a great challenge. The growing number of experimentally validated samples has prompted the development of machine-learning-based algorithms, including Support Vector Machine (SVM), Convolutional Neural Network (CNN), and Graph Neural Network (GNN) models. However, these methods still suffer from limited accuracy and poor generalization when designing siRNAs that are both potent and specific.
Results: In this study, we propose a novel approach for siRNA inhibition and off-target effect prediction, named AttSiOff. It combines a self-attention-based siRNA inhibition predictor with an mRNA searching package and an off-target filter. The predictor produces an inhibition score by analyzing embeddings of the siRNA and local mRNA sequences, generated by the pre-trained RNA-FM model, together with other meaningful prior-knowledge-based features. The self-attention mechanism can detect potentially decisive features that may determine the inhibition of an siRNA, and it captures global and local dependencies more efficiently than ordinary convolutions. The 10-fold cross-validation results indicate that our model achieves a significant improvement in the correlation between prediction and label compared with all existing methods. It also shows better generalization and robustness in cross-dataset validation. In addition, the mRNA searching package can find all mature mRNAs for a given gene name from the GENOMES database, and the off-target filter can calculate the number of unwanted off-target binding sites, which affects the specificity of an siRNA. Experiments on mature siRNA drugs show that the entire framework, AttSiOff, is convenient and easy to operate in practical applications.
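To make the self-attention idea mentioned in the abstract concrete, the following is a minimal single-head scaled dot-product self-attention sketch in plain Python. It is not the AttSiOff architecture: the one-hot nucleotide encoding merely stands in for RNA-FM embeddings, the identity projection matrices are placeholders for learned weights, and every name (`self_attention`, `one_hot`, the example sequence) is hypothetical.

```python
import math

NUCLEOTIDES = "ACGU"


def one_hot(seq):
    """Encode an RNA string as one-hot rows (a stand-in for learned embeddings)."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    Every position attends to every other position, so dependencies between
    distant nucleotides are modelled in a single step.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    scores = matmul(q, transpose(k))
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


# Hypothetical toy input: a 4-nucleotide fragment and identity projections.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
tokens = one_hot("GUCA")
context, weights = self_attention(tokens, identity, identity, identity)
```

Each row of `weights` is a probability distribution over positions, which is what allows such a layer to highlight the positions that contribute most to a prediction.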
Contact: yuanye_auto@sjtu.edu.cn or chengjin520@sjtu.edu.cn.
Competing Interest Statement: The authors have declared no competing interest.
Abbreviations: RNA-FM: RNA fundamental model; RNAi: RNA interference; siRNA: small interfering RNA; CNN: Convolutional Neural Network; GNN: Graph Neural Network; PCC: Pearson Correlation Coefficient; SPCC: Spearman Correlation Coefficient; AUC: Area under the Receiver Operating Characteristic curve
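The abstract reports performance as the correlation between predictions and labels, and PCC and SPCC are the two correlation measures in the abbreviation list. As a minimal sketch of how they can be computed, the standard-library-only code below implements both. The function names and the example values are hypothetical and are not taken from the paper's data or code.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC) between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation coefficient (SPCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical predicted scores and measured inhibition values.
predicted = [0.82, 0.40, 0.65, 0.10, 0.55]
measured = [0.90, 0.35, 0.70, 0.20, 0.50]
pcc = pearson(predicted, measured)
spcc = spearman(predicted, measured)
```

PCC measures linear agreement, while SPCC depends only on the ordering of the values, which matters when the goal is to rank candidate siRNAs rather than predict exact inhibition levels.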