Ultra-Sensitive Detection of Transposon Insertions Across Multiple Families by Transposable Element Display Sequencing

Pol Vendrell-Mir, Basile Leduque, Leandro Quadrana
DOI: https://doi.org/10.1101/2024.08.21.608910
2024-08-22
Abstract: Mobilization of transposable elements (TEs) can generate large-effect mutations. However, because new TE insertions are challenging to detect and transposition is typically rare, the actual rate and landscape of new insertions remain unexplored for most TEs. Here, we introduce a TE display sequencing approach that leverages target amplification of TE extremities to detect non-reference TE insertions with high sensitivity and specificity. By implementing this approach on serial dilutions of genomic DNA from A. thaliana lines carrying different repertoires of new TE insertions, we show that the method can detect TE insertions that are present at frequencies as low as 1:250 000 within a DNA sample. In addition, TE display sequencing can be multiplexed to simultaneously detect insertions for distinct TE families, including both retrotransposons and DNA transposons, increasing its versatility and cost-effectiveness for investigating complex mobilomes. Importantly, when combined with nanopore sequencing, this approach enables the identification of insertions using long reads and achieves a turnaround time from DNA extraction to insertion identification of less than 24 h, significantly reducing the time-to-answer. Analysis of TE insertions in large populations of A. thaliana plants undergoing a transposition burst demonstrates the power of multiplex TE display sequencing to assess the rates and allele frequencies of heritable insertions, enabling its use in large-scale evolve-and-resequence experiments. Furthermore, we found that ~6% of de novo TE insertions show recurrent allele frequency changes consistent with either positive or negative selection. We conclude that TE display sequencing is an ultra-sensitive, specific, quick, and cost-effective approach to investigate the rate and landscape of new insertions for multiple TEs in large-scale population experiments.
We provide a step-by-step experimental protocol as well as ready-to-use bioinformatic pipelines, ensuring straightforward implementation of the method.
Genomics