an R/Bioconductor package for the identification, classification, and analysis of gene and genome duplications

Fabricio Almeida-Silva, Yves Van de Peer
DOI: https://doi.org/10.1101/2024.02.27.582236
2024-02-29
Abstract: Gene and genome duplications are major evolutionary forces that shape the diversity and complexity of life. However, different duplication modes have distinct impacts on gene function, expression, and regulation. Existing tools for identifying and classifying duplicated genes are either outdated or not user-friendly. Here, we present an R/Bioconductor package that provides a comprehensive and robust framework for analyzing duplicated genes from genomic data. The package can detect and classify gene pairs as derived from six duplication modes (segmental, tandem, proximal, retrotransposon-derived, DNA transposon-derived, and dispersed duplications), calculate substitution rates, detect signatures of putative whole-genome duplication events, and visualize results as publication-ready figures. We applied the package to classify the duplicated gene repertoire in 822 eukaryotic genomes and made the results available through a user-friendly web interface. The package is freely accessible from Bioconductor, and it provides a valuable resource to study the evolutionary consequences of gene and genome duplications.
Bioinformatics