Dobali: a Domain-based Multiple Sequence Alignment Tool

Guang-Yen Chang, Chin-Liang Tsai, Tsung-Lun Wu, Cheng-Wen Chang, Chuan Yi Tang
2010-01-01
Abstract: Domains are the basic evolutionary units of proteins, and many proteins contain more than one domain. A multiple sequence alignment (MSA) tool that takes domain structure into account should therefore be useful when working with multi-domain proteins. DOBALI, a domain-based multiple sequence alignment tool, is proposed here. DOBALI consists of two major steps: domain alignment and segment alignment. In the first step, the "domain architectures" of the proteins are assigned and their shortest common supersequence is obtained. In the second step, the segments belonging to the same domain type are aligned by one of the multiple sequence alignment methods incorporated in DOBALI. DOBALI can thus be viewed as an MSA-preprocessing tool: it cuts sequences into segments according to domain boundaries and outlines the final alignment based on the alignment of domains. An analysis of some glycosyl hydrolases indicated that the alignments obtained from DOBALI were better than or similar to those obtained without preprocessing.
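The domain-alignment step can be illustrated with a minimal sketch. The abstract does not say how DOBALI computes the shortest common supersequence, so the code below makes its own assumptions: each domain architecture is a list of domain labels, an exact pairwise supersequence is built from a longest-common-subsequence table, and the architectures are merged one at a time. That progressive merge is a heuristic and is not guaranteed to be shortest for more than two architectures. The function names (pairwise_scs, supersequence, place) and the labels "A" and "B" are hypothetical and are not taken from DOBALI.

```python
from functools import reduce


def pairwise_scs(a, b):
    """Shortest common supersequence of two label lists, via an LCS table."""
    n, m = len(a), len(b)
    # lcs[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = 1 + lcs[i + 1][j + 1]
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    # Walk the table: shared labels are emitted once, others as they come.
    out, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    return out + a[i:] + b[j:]


def supersequence(architectures):
    """Assumed heuristic: fold the architectures together pairwise."""
    return reduce(pairwise_scs, architectures, [])


def place(arch, template):
    """Column of each domain in the template (greedy, left to right)."""
    cols, k = [], 0
    for dom in arch:
        k = template.index(dom, k)  # always found: template contains arch
        cols.append(k)
        k += 1
    return cols


if __name__ == "__main__":
    # Hypothetical architectures for three proteins.
    archs = {"P1": ["A", "B"], "P2": ["B", "A"], "P3": ["B"]}
    template = supersequence(list(archs.values()))
    print(template)
    for name, arch in archs.items():
        print(name, place(arch, template))
```

Each template column then collects the segments of one domain type, and those segments would be handed to an ordinary MSA method in the segment-alignment step. Proteins with no domain in a given column would receive gaps there.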