Large-Scale Detection of Telomeric Motif Sequences in Genomic Data Using TelFinder.

Qing Sun, Hao Wang, Shiheng Tao, Xuguang Xi
DOI: https://doi.org/10.1128/spectrum.03928-22
IF: 3.7
2023-01-01
Microbiology Spectrum
Abstract: Telomeres are regions of tandemly repeated sequences at the ends of linear chromosomes that protect against DNA damage and chromosome fusion. Telomeres are associated with senescence and cancers and have attracted the attention of an increasing number of researchers. However, few telomeric motif sequences are known. Given the mounting interest in telomeres, an efficient computational tool for the de novo detection of the telomeric motif sequence of new species is needed, since experimental methods are costly in terms of time and effort. Here, we report the development of TelFinder, an easy-to-use and freely available tool for the de novo detection of telomeric motif sequences from genomic data. The vast quantity of readily available genomic data makes it possible to apply this tool to any species of interest, which will undoubtedly inspire studies requiring telomeric repeat information and improve the utilization of these genomic data sets. We have tested TelFinder on telomeric sequences available in the Telomerase Database, and the detection accuracy reaches 90%. In addition, variation analyses in telomere sequences can be performed by TelFinder for the first time. The telomere variation preferences of different chromosomes, and even of the two ends of a chromosome, can provide clues regarding the underlying mechanisms of telomeres. Overall, these results shed new light on the divergent evolution of telomeres.
IMPORTANCE Telomeres are reported to be highly correlated with the cell cycle and aging. As a result, research on telomere composition and evolution has become increasingly urgent. However, using experimental methods to detect telomeric motif sequences is slow and costly. To address this challenge, we developed TelFinder, a computational tool for the de novo detection of telomere composition using only genomic data. In this study, we showed that many complex telomeric motifs could be identified by TelFinder using only genomic data. In addition, TelFinder can be used to perform variation analyses in telomere sequences, which could lead to a deeper understanding of telomere sequences.
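To make the idea of de novo motif detection from genomic data concrete, the following is a minimal sketch and not TelFinder's actual algorithm, which the abstract does not describe. It assumes a naive approach: count k-mers in the terminal window of a chromosome sequence and report the most frequent one, treating rotations of the same repeat unit as equivalent. The function names, the default `k=6`, and the `window` size are all hypothetical choices made for illustration.

```python
from collections import Counter


def canonical_rotation(kmer: str) -> str:
    """Return the lexicographically smallest rotation of a k-mer, so that
    TTAGGG, TAGGGT and GGGTTA are all counted as the same repeat unit."""
    return min(kmer[i:] + kmer[:i] for i in range(len(kmer)))


def guess_terminal_motif(seq: str, k: int = 6, window: int = 1000):
    """Naive illustration (an assumption, not TelFinder's method):
    count k-mers in the last `window` bases of `seq` and return the most
    common rotation class together with its count."""
    tail = seq[-window:].upper()
    if len(tail) < k:
        raise ValueError("terminal window is shorter than k")
    counts = Counter(
        canonical_rotation(tail[i:i + k])
        for i in range(len(tail) - k + 1)
    )
    return counts.most_common(1)[0]


# Toy chromosome: non-repetitive filler followed by 50 copies of TTAGGG.
toy = "ACGTTGCA" * 20 + "TTAGGG" * 50
print(guess_terminal_motif(toy, k=6, window=200))  # ('AGGGTT', 195)
```

The reported unit `AGGGTT` is a rotation of the vertebrate motif TTAGGG; a real tool would also need to handle unknown motif lengths, both chromosome ends, reverse complements, and sequence variation within the repeat array.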