A novel termini analysis theory using HTS data alone for the identification of Enterococcus phage EF4-like genome termini

Xianglilan Zhang,Yahui Wang,Shasha Li,Xiaoping An,Guangqian Pei,Yong Huang,Hang Fan,Zhiqiang Mi,Zhiyi Zhang,Wei Wang,Yubao Chen,Yigang Tong
DOI: https://doi.org/10.1186/s12864-015-1612-3
IF: 4.547
2015-01-01
BMC Genomics
Abstract:
Background: Enterococcus faecalis and Enterococcus faecium are typical enterococcal bacterial pathogens. Because of antibiotic resistance, identifying novel E. faecalis and E. faecium phages that target antibiotic-resistant Enterococcus has an important impact on public health. In this study, the E. faecalis phage IME-EF4, E. faecium phage IME-EFm1, and both their hosts were antibiotic resistant. To characterize the genome termini of these two phages, a termini analysis theory was developed to provide a wealth of terminal sequence information directly, using only high-throughput sequencing (HTS) read frequency statistics.
Results: The complete genome sequences of phages IME-EF4 and IME-EFm1 were determined, and our termini analysis theory was used to determine the genome termini of these two phages. Analysis of HTS read frequencies revealed 9 nt 3′ protruding cohesive ends in both the IME-EF4 and IME-EFm1 genomes. For the positive strands of their genomes, the 9 nt 3′ protruding cohesive ends are 5′-TCATCACCG-3′ (IME-EF4) and 5′-GGGTCAGCG-3′ (IME-EFm1). Further experiments, including mega-primer polymerase chain reaction sequencing, terminal run-off sequencing, and adaptor ligation followed by run-off sequencing, confirmed these results.
Conclusion: Using this termini analysis theory, the termini of two newly isolated antibiotic-resistant Enterococcus phages, IME-EF4 and IME-EFm1, were identified as a byproduct of HTS. Molecular biology experiments confirmed the identification. Because it does not require time-consuming wet lab termini analysis experiments, the termini analysis theory is a fast and easy means of identifying phage DNA genome termini using HTS read frequency statistics alone. It may aid understanding of phage DNA packaging.
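The idea of locating termini from read frequency statistics can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that reads beginning at a fixed genome end are over-represented relative to reads starting elsewhere, and it uses exact string matching in place of a real read aligner. The function names `read_start_counts` and `top_start` are hypothetical, as is the choice to compare the top count against the mean count over observed start positions.

```python
from collections import Counter


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def read_start_counts(genome, reads):
    """Count how often each genome position is the 5' start of a read.

    Returns (forward, reverse) Counters keyed by 0-based position on the
    forward strand. Assumption: only exact matches are used, and a read
    that occurs more than once is assigned to its first occurrence.
    """
    forward, reverse = Counter(), Counter()
    for read in reads:
        pos = genome.find(read)
        if pos != -1:
            forward[pos] += 1
            continue
        pos = genome.find(reverse_complement(read))
        if pos != -1:
            # A reverse-strand read starts at the last base of its match.
            reverse[pos + len(read) - 1] += 1
    return forward, reverse


def top_start(counts):
    """Return (position, count, count / mean count) for the most frequent start.

    The mean is taken over positions with at least one read start.
    Returns None when no reads were counted.
    """
    if not counts:
        return None
    position, count = counts.most_common(1)[0]
    mean = sum(counts.values()) / len(counts)
    return position, count, count / mean
```

Calling `read_start_counts` on an assembled genome and its reads, then `top_start` on each strand, yields one candidate position per strand. A candidate whose count far exceeds the mean would be a position worth checking as a terminus, and the offset between the forward and reverse candidates would hint at the length of a cohesive end. The threshold for "far exceeds" is left open here because the abstract does not state one.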