Concatenation of segmented viral genomes for reassortment analysis

A. Ivanova, P. Volchkov, A. Deviatkin
DOI: https://doi.org/10.1101/2024.02.26.582008
2024-02-26
Abstract: Most reassortment identification methods search for discrepancies between the phylogenetic trees of different segments. Other methods use pairwise genetic distances or compare the position of individual genome components in a tree relative to a reference component of the viral genome. However, such approaches are labour-intensive and scale poorly. Recent advances in the availability of viral sequencing technologies have led to the sequencing of large numbers of pathogen genomes, making manual processing of such large datasets difficult. At the same time, recombination analysis methods can process almost any number of sequences simultaneously. These methods are not designed for the simultaneous analysis of multiple segments and are therefore not used to search for reassortment events. However, if all segments are concatenated sequentially, recombination analysis methods can be used to detect traces of reassortment events. In order to use recombination search algorithms in the study of reassortment events, we have developed a method (Virus Segment Concatenator, VSC) that automatically concatenates the sequences of all segments of a virus into a single sequence. The code is available at . The service is implemented as a web application . The tool accepts files in GenBank format as input and generates a set of sequences in FASTA format, each of which is the sequential concatenation of the segments of one virus, named in accordance with the “strain” field of the GenBank record annotation. The applicability of VSC for automated searches for reassortment events was demonstrated using CCHFV and the H5N5 subtype of influenza virus.
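The concatenation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' VSC code: the function name `concatenate_segments`, the tuple layout `(strain, segment, sequence)`, the segment names, and the rule of skipping strains with missing segments are all assumptions made for illustration. The sketch also assumes that the strain and segment labels have already been extracted from the GenBank records, and it does not align the segments.

```python
from collections import defaultdict


def concatenate_segments(records, segment_order):
    """Join the segments of each strain in a fixed order.

    records: iterable of (strain, segment, sequence) tuples (hypothetical layout).
    segment_order: list of segment names defining the concatenation order.
    Returns (fasta_text, skipped_strains).
    """
    # Group sequences by strain, then by segment name.
    by_strain = defaultdict(dict)
    for strain, segment, sequence in records:
        by_strain[strain][segment] = sequence.upper()

    fasta_entries = []
    skipped = []
    for strain, segments in by_strain.items():
        # Assumption: a strain is only usable if every segment is present.
        if all(name in segments for name in segment_order):
            joined = "".join(segments[name] for name in segment_order)
            fasta_entries.append(f">{strain}\n{joined}")
        else:
            skipped.append(strain)
    return "\n".join(fasta_entries), skipped


# Toy input: a three-segment virus with one complete and one incomplete strain.
records = [
    ("strainA", "S", "ATGC"),
    ("strainA", "M", "GGTT"),
    ("strainA", "L", "CCAA"),
    ("strainB", "S", "ATGA"),
    ("strainB", "L", "CCAT"),
]
fasta, skipped = concatenate_segments(records, ["S", "M", "L"])
print(fasta)
print("skipped:", skipped)
```

Keeping the segment order identical for every strain matters here, because a recombination tool can only interpret a breakpoint as a reassortment signal if the same position in each concatenated sequence belongs to the same segment.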
Bioinformatics