Modern tools for annotation of small genomes of non-model eukaryotes

Marina Galchenkova, Aleksei Korzhenkov
DOI: https://doi.org/10.48550/arXiv.2102.04058
IF: 4.31
2021-02-08
Genomics
Abstract: With the growing volume of experimental data obtained by sequencing, interest has shifted from determining the nucleotide sequence of a genome to determining the functions and characteristics of its individual parts. Genome annotation includes identifying coding and non-coding sequences, determining gene structure, and determining the functions of these sequences. Despite significant advances in computational methods for working with sequencing data, there is no general approach to functional genome annotation, because the molecular function of a large number of genomic regions remains unresolved. Nevertheless, the scientific community is trying to solve this problem. This review analyzes existing approaches to eukaryotic genome annotation. It has three main parts: introduction, main body, and discussion. The introduction traces the development of standalone tools and automatic pipelines for annotating eukaryotic genomes, which build on existing achievements in annotating prokaryotic ones. The main body consists of two distinct parts: the first is devoted to instructions for annotating the genomes of non-model eukaryotes, and the second covers recent versions of automatic pipelines that require minimal user curation. The question of assessing the quality and completeness of an annotated genome is addressed briefly, and the tools for conducting this analysis are discussed. Currently, there is no universal automatic software for eukaryotic genome annotation that covers the whole list of tasks without manual curation or additional external tools and resources. This motivates the development of a more broadly functional, universal protocol for the automatic annotation of small eukaryotic genomes.