GALEON: a comprehensive bioinformatic tool to analyse and visualize gene clusters in complete genomes

Vadim A Pisarenco, Joel Vizueta, Julio Rozas
DOI: https://doi.org/10.1093/bioinformatics/btae439
IF: 5.8
2024-07-01
Bioinformatics
Abstract:
Motivation: Gene clusters, defined as a set of genes encoding functionally related proteins, are abundant in eukaryotic genomes. Despite the increasing availability of chromosome-level genomes, the comprehensive analysis of gene family evolution remains largely unexplored, particularly for large and highly dynamic gene families or those including very recent family members. These challenges stem from limitations in genome assembly contiguity, particularly in repetitive regions such as large gene clusters. Recent advancements in sequencing technology, such as long reads and chromatin contact mapping, hold promise in addressing these challenges.
Results: To facilitate the identification, analysis, and visualization of physically clustered gene family members within chromosome-level genomes, we introduce GALEON, a user-friendly bioinformatic tool. GALEON identifies gene clusters by studying the spatial distribution of pairwise physical distances among gene family members along with the genome-wide gene density. The pipeline also enables the simultaneous analysis and comparison of two gene families and allows the exploration of the relationship between physical and evolutionary distances. This tool offers a novel approach for studying the origin and evolution of gene families.
Availability and implementation: GALEON is freely available from https://www.ub.edu/softevol/galeon and https://github.com/molevol-ub/galeon
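As a rough illustration of the distance-based idea summarized in the abstract, the following Python sketch groups members of one gene family into physical clusters when neighbouring genes on the same chromosome lie within a chosen distance. It is not GALEON's implementation: the function name `find_clusters`, the `max_gap` threshold, and the `min_size` parameter are hypothetical, the coordinates are made up, and the sketch ignores the genome-wide gene density that GALEON also takes into account.

```python
from collections import defaultdict


def find_clusters(genes, max_gap=100_000, min_size=2):
    """Group family members into clusters of physically close genes.

    genes: iterable of (gene_id, chromosome, start, end) tuples.
    max_gap: hypothetical maximum distance (bp) between a gene's start
             and the furthest end seen so far in the current cluster.
    min_size: minimum number of genes for a group to count as a cluster.
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, start, end in genes:
        by_chrom[chrom].append((start, end, gene_id))

    clusters = []
    for chrom in sorted(by_chrom):
        members = sorted(by_chrom[chrom])  # order genes by start coordinate
        current = [members[0][2]]
        current_end = members[0][1]
        for start, end, gene_id in members[1:]:
            if start - current_end <= max_gap:
                # Close enough to the running cluster: extend it.
                current.append(gene_id)
                current_end = max(current_end, end)
            else:
                # Gap too large: close the running cluster, start a new one.
                if len(current) >= min_size:
                    clusters.append((chrom, current))
                current = [gene_id]
                current_end = end
        if len(current) >= min_size:
            clusters.append((chrom, current))
    return clusters


# Made-up coordinates for illustration only.
example_genes = [
    ("g1", "chr1", 1_000, 2_000),
    ("g2", "chr1", 5_000, 6_000),
    ("g3", "chr1", 900_000, 901_000),
    ("g4", "chr2", 100, 500),
]
print(find_clusters(example_genes, max_gap=10_000))
```

With these inputs, only `g1` and `g2` are close enough to form a cluster; `g3` and `g4` remain singletons and are not reported.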
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology
What problem does this paper attempt to address?