Assignment of orthologous genes in unbalanced genomes using cycle packing of adjacency graphs

Gabriel Siqueira, Andre Rodrigues Oliveira, Alexsandro Oliveira Alexandrino, Géraldine Jean, Guillaume Fertin, Zanoni Dias
DOI: https://doi.org/10.1007/s10732-024-09528-z
2024-06-01
Journal of Heuristics
Abstract: The adjacency graph is a structure used to model genomes in several rearrangement distance problems. In particular, most studies use properties of a maximum cycle packing of this graph to develop bounds and algorithms for rearrangement distance problems, such as the reversal distance, the reversal and transposition distance, and the double cut and join distance. When each genome has no repeated genes, there exists only one cycle packing for the graph. However, when each genome may have repeated genes, the problem of finding a maximum cycle packing for the adjacency graph (adjacency graph packing) is NP-hard. In this work, we develop a randomized greedy heuristic and a genetic algorithm heuristic for the adjacency graph packing problem for genomes with repeated genes and unequal gene content. We also propose new algorithms with simple implementation and good practical performance for reversal distance and reversal and transposition distance in genomes without repeated genes, which we combine with the heuristics to find solutions for the problems with repeated genes. We present experimental results and compare the application of these heuristics with the application of the MSOAR framework in rearrangement distance problems. Lastly, we apply our genetic algorithm heuristic to real genomic data to validate its practical use.
computer science, artificial intelligence, theory & methods