Build a Bioinformatic Analysis Platform and Apply it to Routine Analysis of Microbial Genomics and Comparative Genomics
Hualin Liu, Bingyue Xin, Jinshui Zheng, Hao Zhong, Yun Yu, Donghai Peng, Ming Sun
DOI: https://doi.org/10.21203/rs.2.21224/v5
2021-01-01
Abstract: Genomics and comparative genomics are increasingly used as routine methods in general microbiological research. However, even a simple analysis requires using several tools or writing custom scripts, which is complicated for most biological researchers. To simplify this process, particularly for microbiologists, we have developed PGCGAP, a comprehensive, malleable, and easily installed prokaryotic genomic and comparative genomic analysis pipeline. PGCGAP implements genome assembly, gene prediction and annotation, genome and metagenome distance estimation, phylogenetic analysis, COG annotation, pan-genome analysis, inference of orthologous gene groups, variant calling and annotation, and screening for antimicrobial and virulence genes. Although we have tried our best to simplify the installation and usage of PGCGAP, it may still be difficult for non-bioinformaticians to master. Therefore, we created a protocol to help microbiologists without any experience in bioinformatics establish their own bioinformatics platform and perform routine analyses. This protocol shows how to choose the equipment, install a Linux subsystem on a laptop running Windows 10, install PGCGAP, and perform all analyses with an example dataset. The protocol requires a basic understanding of Linux, so an additional web page was written to help uninitiated users learn Linux and whole-genome sequencing (https://github.com/liaochenlanruo/pgcgap/wiki/Learning-bioinformatics or http://bcam.hzau.edu.cn/linuxwgs.php).