HiTE: a Fast and Accurate Dynamic Boundary Adjustment Approach for Full-Length Transposable Element Detection and Annotation

Kang Hu, Peng Ni, Minghua Xu, You Zou, Jianye Chang, Xin Gao, Yaohang Li, Jue Ruan, Bin Hu, Jianxin Wang
DOI: https://doi.org/10.1038/s41467-024-49912-8
IF: 16.6
2024-01-01
Nature Communications
Abstract: Recent advancements in genome assembly have greatly improved the prospects for comprehensive annotation of Transposable Elements (TEs). However, existing methods for TE annotation using genome assemblies suffer from limited accuracy and robustness, requiring extensive manual editing. In addition, the currently available gold-standard TE databases are not comprehensive, even for extensively studied species, highlighting the critical need for an automated TE detection method to supplement existing repositories. In this study, we introduce HiTE, a fast and accurate dynamic boundary adjustment approach designed to detect full-length TEs. The experimental results demonstrate that HiTE outperforms RepeatModeler2, the state-of-the-art tool, across various species. Furthermore, HiTE has identified numerous novel transposons with well-defined structures containing protein-coding domains, some of which are directly inserted within crucial genes, leading to direct alterations in gene expression. A Nextflow version of HiTE is also available, with enhanced parallelism, reproducibility, and portability. Existing methods for detecting transposable elements (TEs) in genome assemblies have limited accuracy and robustness, and the results often require extensive manual editing. Here, the authors present a fast and accurate dynamic boundary adjustment approach that improves detection and annotation of full-length TEs across various species.