Algorithms for determining transposons in gene sequences

Yue Wang
DOI: https://doi.org/10.48550/arXiv.1506.02424
2022-09-01
Abstract: Some genes can change their relative locations in a genome. Thus, for different individuals of the same species, the order of genes may differ. Such jumping genes are called transposons. A practical problem is to determine the transposons in given gene sequences. Through an intuitive rule, we transform the biological problem of determining transposons into a rigorous mathematical problem of determining the longest common subsequence. Depending on whether the gene sequence is linear (each sequence has a fixed head and tail) or circular (we can choose any gene as the head, and the previous one is the tail), and whether genes have multiple copies, we classify the problem of determining transposons into four scenarios: (1) linear sequences without duplicated genes; (2) circular sequences without duplicated genes; (3) linear sequences with duplicated genes; (4) circular sequences with duplicated genes. With the help of graph theory, we design fast algorithms for the different scenarios. We also derive some results that might be of theoretical interest in combinatorics.
Subjects: Genomics; Computational Engineering, Finance, and Science; Data Structures and Algorithms
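
As an illustration of scenario (1) in the abstract (linear sequences without duplicated genes), the following is a minimal sketch and not necessarily the algorithm used in the paper. It relies on a standard fact: when two sequences contain the same genes, each exactly once, a longest common subsequence can be found by mapping one sequence to positions in the other and computing a longest increasing subsequence. The function names are hypothetical, and treating every gene outside the common subsequence as a candidate transposon is an assumption about the "intuitive rule" mentioned in the abstract.

```python
from bisect import bisect_left


def lcs_without_duplicates(a, b):
    """Return one longest common subsequence of a and b.

    Assumes no gene appears more than once in either sequence.
    """
    pos_in_a = {gene: i for i, gene in enumerate(a)}
    # Keep only genes present in both sequences, in the order they appear in b.
    genes = [g for g in b if g in pos_in_a]
    ranks = [pos_in_a[g] for g in genes]

    # Longest increasing subsequence of ranks by patience sorting.
    tail_vals = []  # tail_vals[k]: smallest last rank of an increasing run of length k + 1
    tail_idx = []   # index into ranks of that last element
    prev = [-1] * len(ranks)  # back-pointers for reconstruction
    for i, r in enumerate(ranks):
        k = bisect_left(tail_vals, r)
        if k == len(tail_vals):
            tail_vals.append(r)
            tail_idx.append(i)
        else:
            tail_vals[k] = r
            tail_idx[k] = i
        prev[i] = tail_idx[k - 1] if k > 0 else -1

    # Walk the back-pointers from the end of the longest run.
    result = []
    i = tail_idx[-1] if tail_idx else -1
    while i != -1:
        result.append(genes[i])
        i = prev[i]
    return result[::-1]


def candidate_transposons(a, b):
    """Genes of a outside the common subsequence (assumed interpretation)."""
    kept = set(lcs_without_duplicates(a, b))
    return [g for g in a if g not in kept]


if __name__ == "__main__":
    individual_1 = [1, 2, 3, 4, 5]
    individual_2 = [1, 3, 4, 2, 5]
    print(lcs_without_duplicates(individual_1, individual_2))
    print(candidate_transposons(individual_1, individual_2))
```

The sketch runs in O(n log n) time for sequences of length n. When several longest common subsequences exist, it returns only one of them, so the reported candidates are not necessarily unique. The circular and duplicated-gene scenarios need additional handling that is not covered here.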