Assessment of bacterial ncRNA gene prediction tools using bacterial genomes with different G+C content

LIU Lin-Meng,WEN Quan,OU Hong-Yu
DOI: https://doi.org/10.13344/j.microbiol.china.140280
2014-01-01
Microbiology
Microbiology China (微生物学通报), 2014, Vol. 41, No. 12, http://journals.im.ac.cn/wswxtbcn
Abstract: [Objective] Bacterial non-coding RNAs (ncRNAs) are a versatile class of RNA molecules that play important roles in microbial life processes. In this study, we assessed three frequently used ncRNA gene prediction tools on bacterial genomes with different G+C contents. [Methods] Prediction tools representing the position weight matrix method (sRNAscanner), the comparative genomics method (sRNAPredict) and the machine learning method (PORTRAIT) were tested on 7 BSRD-archived bacterial genomes with low, medium and high G+C contents, each of which contains more than 30 experimentally verified ncRNA genes. A set of genomic G+C content-associated position weight matrices for the transcription initiation and termination regions of ncRNA genes was generated and employed to test sRNAscanner prediction. [Results] sRNAPredict had higher specificity and positive predictive value, but lower sensitivity, than PORTRAIT. The performance of both tools varied among the selected strains with different G+C contents. The resulting G+C content-associated matrices slightly improved the average accuracy of sRNAscanner. [Conclusion] The variation in accuracy of the bacterial ncRNA gene detection tools under study was attributed to genomic G+C heterogeneity. Conserved sequence features of ncRNA gene promoters and terminators in genomes with similar G+C contents may help improve bacterial ncRNA gene prediction.
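The abstract compares tools by sensitivity, specificity and positive predictive value, and groups genomes by G+C content. The following minimal Python sketch shows how these quantities are conventionally computed. The function names `gc_content` and `prediction_metrics` are hypothetical, the counts in the usage lines are made-up illustrations rather than results from the paper, and how the authors defined true negatives for gene prediction is not stated in the abstract, so the standard confusion-matrix definitions are assumed here.

```python
import math


def gc_content(seq: str) -> float:
    """Return the G+C fraction of a sequence, counting only A, C, G and T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return gc / (gc + at)


def _ratio(numerator: int, denominator: int) -> float:
    """Divide safely, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def prediction_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard confusion-matrix metrics (assumed definitions, see lead-in)."""
    return {
        "sensitivity": _ratio(tp, tp + fn),  # verified genes that were found
        "specificity": _ratio(tn, tn + fp),  # negatives correctly rejected
        "ppv": _ratio(tp, tp + fp),          # predictions that are verified
    }


# Illustrative values only, not data from the paper.
example_gc = gc_content("ATGCGC")
example_metrics = prediction_metrics(tp=30, fp=10, tn=50, fn=10)
```

A tool with high specificity and positive predictive value but low sensitivity, as reported for sRNAPredict relative to PORTRAIT, would show few false positives (`fp`) but many missed genes (`fn`) in this scheme.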