Effective Identification of Bacterial Genomes From Short and Long Read Sequencing Data

Jian Liu, Jialiang Sun, Yongzhuang Liu
DOI: https://doi.org/10.1109/TCBB.2021.3095164
2022-01-01
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Abstract: With the development of sequencing technology, microbial genome sequencing analysis has attracted extensive attention. For inexperienced users without sufficient bioinformatics skills, making sense of sequencing data for microbial identification, especially bacterial identification, through read analysis is still challenging. To address the challenge of effectively analyzing genomic information, we develop PBGI, an automated bioinformatics pipeline for bacterial genome identification that performs automated and customized analysis of short-read or long-read sequencing data produced by multiple platforms such as Illumina, PacBio, and Oxford Nanopore. An evaluation of the proposed approach on a practical data set shows that PBGI provides a user-friendly way to perform bacterial identification through short- or long-read analysis and can provide accurate analysis results. The source code of PBGI is freely available at https://github.com/lyotvincent/PBGI.