The Matrix Method of Representation, Analysis and Classification of Long Genetic Sequences

Ivan Stepanyan,Sergey Petoukhov
DOI: https://doi.org/10.3390/info8010012
2017-01-17
Information
Abstract:The article is devoted to a matrix method of comparative analysis of long nucleotide sequences by means of presenting each sequence in the form of three digital binary sequences. This method uses a set of symmetries of biochemical attributes of nucleotides. It also uses the possibility of presentation of every whole set of N-mers as one of the members of a Kronecker family of genetic matrices. With this method, a long nucleotide sequence can be visually represented as an individual fractal-like mosaic or another regular mosaic of binary type. In contrast to natural nucleotide sequences, artificial random sequences give non-regular patterns. Examples of binary mosaics of long nucleotide sequences are shown, including cases of human chromosomes and penicillins. The obtained results are then discussed.
What problem does this paper attempt to address?