HiFiBGC: an ensemble approach for improved biosynthetic gene cluster detection in PacBio HiFi-read metagenomes

Amit Yadav, Srikrishna Subramanian
DOI: https://doi.org/10.1186/s12864-024-10950-7
IF: 4.547
2024-11-22
BMC Genomics
Abstract: Microbes produce diverse bioactive natural products with applications in fields such as medicine and agriculture. In their genomes, these natural products are encoded by physically clustered genes known as biosynthetic gene clusters (BGCs). Advances in genome and metagenome sequencing have enabled high-throughput identification of BGCs, a promising avenue for natural product discovery. Mining BGCs from (meta)genomes with in silico tools has given access to a vast diversity of potentially novel natural products. However, a fundamental limitation has been the difficulty of assembling complete BGCs, especially from complex metagenomes. Because their assemblies are fragmented, short-read technologies struggle to recover complete BGCs, particularly long and repetitive ones such as nonribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) clusters. Recent advances in long-read sequencing, such as the High Fidelity (HiFi) technology from PacBio, have eased this limitation and can help retrieve BGCs that are both accurate and complete from metagenomes. This warrants improving existing BGC identification approaches to make better use of HiFi data.
genetics & heredity, biotechnology & applied microbiology