GTax: improving de novo transcriptome assembly by removing foreign RNA contamination

Roberto Vera Alvarez, David Landsman
DOI: https://doi.org/10.1186/s13059-023-03141-2
IF: 17.906
2024-01-10
Genome Biology
Abstract: The cost and complexity of generating a complete reference genome mean that many organisms lack an annotated reference. An alternative is to use a de novo reference transcriptome. This technology is cost-effective but is susceptible to off-target RNA contamination. In this manuscript, we present GTax, a taxonomy-structured database of genomic sequences that can be used with BLAST to detect and remove foreign contamination in RNA sequencing samples before assembly. In addition, we use a de novo transcriptome assembly of Solanum lycopersicum (tomato) to demonstrate that removing foreign contamination in sequencing samples reduces the number of assembled chimeric transcripts.
genetics & heredity, biotechnology & applied microbiology