Putation, Identification and Bioinformatics Analysis of Schistosoma japonicum Aldehyde Dehydrogenase Full Coding Sequence

Wei Wang, De-Li Liu, Wei Hu, Zheng Feng, Zhong Yang
DOI: https://doi.org/10.3969/j.issn.1000-7423.2008.01.007
2008-01-01
Abstract:
OBJECTIVE: To obtain the full coding sequence of Schistosoma japonicum aldehyde dehydrogenase by filling the gap between its known partial sequences.
METHOD: Putative sequence fragments of S. japonicum aldehyde dehydrogenase were retrieved from the transcriptome database with bioinformatics tools, using multiple sequence alignment against homologous sequences from other species. Primers were designed from the EST sequences matching the N-terminal and C-terminal regions, and the gap fragment was amplified by RT-PCR and sequenced. The full coding sequence was then assembled by combining the two existing EST sequences with the amplified fragment. The physicochemical parameters of the new sequence were analyzed with bioinformatics software.
RESULT: Eight S. japonicum EST sequences were predicted to be partial aldehyde dehydrogenase sequences. Two of them (AAW27891 and AAW27047) were predicted to represent the N terminus and C terminus of one protein, respectively. Multiple sequence alignment indicated a gap of about 80 amino acids between them. Primers flanking the gap were designed from the known sequences of AAW27891 and AAW27047. The gap fragment was obtained by RT-PCR, sequenced, and confirmed with bioinformatics software. The full aldehyde dehydrogenase sequence was reassembled by inserting the gap fragment, and the reassembled coding sequence was submitted to GenBank under accession number EF503564. The coding sequence contains an intact ORF of 1,596 bp encoding a deduced protein of 531 amino acids. Bioinformatic analysis of the deduced amino acid sequence gave a predicted molecular weight of 57,330.7 Da and a pI of 7.94. The aldehyde dehydrogenase pattern [LIVMFGA]-E-[LIMSTAC]-[GS]-G-[KNLM]-[SADN]-[TAPFV] was found at positions 290-297 of the new sequence.
CONCLUSION: The gap between the two partial nucleotide sequences was filled, and the full coding sequence of the aldehyde dehydrogenase gene was obtained by combining bioinformatics tools with experiments.
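The pattern search and the ORF length reported above can be illustrated with a minimal Python sketch. It is not the software the authors used, which the abstract does not name. The helper names (`prosite_to_regex`, `find_motif`, `orf_length_bp`) and the toy peptide are hypothetical, and the sketch assumes the reported 1,596 bp includes the stop codon, which the abstract does not state but which matches the arithmetic for 531 residues.

```python
import re

# PROSITE-style pattern quoted in the abstract.
ALDH_PATTERN = "[LIVMFGA]-E-[LIMSTAC]-[GS]-G-[KNLM]-[SADN]-[TAPFV]"


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE pattern into a Python regular expression.

    Handles only plain residues, [allowed sets], {excluded sets},
    the wildcard x, and (n) / (n,m) repeat counts.
    """
    parts = []
    for element in pattern.strip(".").split("-"):
        # {ABC} means "any residue except A, B, C" in PROSITE syntax.
        element = element.replace("{", "[^").replace("}", "]")
        # (n) or (n,m) repeat counts become regex quantifiers.
        element = element.replace("(", "{").replace(")", "}")
        # Lowercase x matches any residue.
        element = element.replace("x", ".")
        parts.append(element)
    return "".join(parts)


def find_motif(sequence: str, pattern: str = ALDH_PATTERN):
    """Return (start, end, match) tuples with 1-based inclusive positions."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.end(), m.group())
            for m in regex.finditer(sequence.upper())]


def orf_length_bp(n_residues: int, include_stop: bool = True) -> int:
    """Length in bp of an ORF encoding n_residues amino acids."""
    return 3 * n_residues + (3 if include_stop else 0)


if __name__ == "__main__":
    # Toy peptide for illustration only; NOT the EF503564 protein sequence.
    toy_peptide = "MKTLELGGKSPAAA"
    print(find_motif(toy_peptide))   # [(4, 11, 'LELGGKSP')]
    print(orf_length_bp(531))        # 1596
```

Run against the real translated sequence of EF503564, `find_motif` would be expected to report a hit at positions 290-297 if the abstract's result holds. The match spans eight residues, the same length as that range.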