Not all predicted CRISPR–Cas systems are equal: isolated cas genes and classes of CRISPR like elements

Quan Zhang,Yuzhen Ye
DOI: https://doi.org/10.1186/s12859-017-1512-4
IF: 3.307
2017-02-06
BMC Bioinformatics
Abstract:BackgroundThe CRISPR–Cas systems in prokaryotes are RNA-guided immune systems that target and deactivate foreign nucleic acids. A typical CRISPR–Cas system consists of a CRISPR array of repeat and spacer units, and a locus of cas genes. The CRISPR and the cas locus are often located next to each other in the genomes. However, there is no quantitative estimate of the co-location. In addition, ad-hoc studies have shown that some non-CRISPR genomic elements contain repeat-spacer-like structures and are mistaken as CRISPRs.ResultsUsing available genome sequences, we observed that a significant number of genomes have isolated cas loci and/or CRISPRs. We found that 11%, 22% and 28% of the type I, II and III cas loci are isolated (without CRISPRs in the same genomes at all or with CRISPRs distant in the genomes), respectively. We identified a large number of genomic elements that superficially reassemble CRISPRs but don’t contain diverse spacers and have no companion cas genes. We called these elements false-CRISPRs and further classified them into groups, including tandem repeats and Staphylococcus aureus repeat (STAR)-like elements.ConclusionThis is the first systematic study to collect and characterize false-CRISPR elements. We demonstrated that false-CRISPRs could be used to reduce the false annotation of CRISPRs, therefore showing them to be useful for improving the annotation of CRISPR–Cas systems.
biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
What problem does this paper attempt to address?