Perl-based high-throughput nucleotide sequence extraction in soybean (Glycine max (L.) Merr.)

ZHANG Da-yong, WANG Chang-biao, YI Jin-xin, HE Xiao-lan, MA Hong-xiang
2012-01-01
Abstract: With the soybean genome now completely sequenced and published, elucidating the biological function and regulatory network of each gene has become an important task. In this study, Perl programs were developed for high-throughput extraction of nucleotide or amino acid sequences for specific blocks and gene families across the whole soybean genome; the programs report the number of genes in a region together with their sequences and functions. As examples, we extracted the genes located between Glyma04g00930.1 and Glyma04g01740.1 on chromosome 4, and the TIP (tonoplast intrinsic protein) genes, which belong to the aquaporin gene family. The results showed that the Glyma04g00930.1-Glyma04g01740.1 region contained 112 genes and that 91 different TIP genes were present in the whole soybean genome. The TIP genes were distributed across every chromosome, with the largest number, 10, on chromosome 12, implying that they may play important roles in water use efficiency. Moreover, the number of introns in these genes ranged from 1 to 8. The extraction programs can be downloaded from http://www.zlhyd.com/sxbi/soybean_strict.rar.
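The region-based extraction described in the abstract can be illustrated with a minimal sketch. It is written in Python for brevity and is not the authors' Perl program. It assumes a FASTA file whose headers begin with identifiers of the form Glyma04g00930.1 (two-digit chromosome, five-digit locus number, optional transcript suffix) and that locus numbers increase along the chromosome. The function names and the toy sequences are hypothetical.

```python
import re

# Assumed identifier layout, e.g. "Glyma04g00930.1":
# chromosome "04", locus number "00930", transcript suffix ".1".
GLYMA_ID = re.compile(r"^Glyma(\d{2})g(\d{5})(?:\.(\d+))?$")


def parse_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Keep only the first whitespace-delimited token as the identifier.
            header = (line[1:].split() or [""])[0]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def locus_key(gene_id):
    """Return (chromosome, locus number) as integers for a Glyma identifier."""
    match = GLYMA_ID.match(gene_id)
    if match is None:
        raise ValueError(f"not a recognised Glyma identifier: {gene_id!r}")
    return int(match.group(1)), int(match.group(2))


def extract_region(records, start_id, end_id):
    """Collect records whose locus lies between two genes on one chromosome."""
    start_chrom, start_locus = locus_key(start_id)
    end_chrom, end_locus = locus_key(end_id)
    if start_chrom != end_chrom:
        raise ValueError("both boundary genes must be on the same chromosome")
    low, high = sorted((start_locus, end_locus))
    selected = {}
    for gene_id, sequence in records:
        if GLYMA_ID.match(gene_id) is None:
            continue  # skip headers that do not follow the assumed layout
        chrom, locus = locus_key(gene_id)
        if chrom == start_chrom and low <= locus <= high:
            selected[gene_id] = sequence
    return selected


if __name__ == "__main__":
    # Toy input with invented placeholder sequences, not real soybean data.
    toy = """>Glyma04g00920.1
ATGAAA
>Glyma04g00930.1 hypothetical description
ATGCCC
GGG
>Glyma04g01740.1
ATGTTT
>Glyma04g01750.1
ATGGGG
"""
    region = extract_region(parse_fasta(toy.splitlines()),
                            "Glyma04g00930.1", "Glyma04g01740.1")
    print(len(region), "records extracted")
    for gene_id, sequence in region.items():
        print(gene_id, sequence)
```

The boundaries are inclusive, matching the way the abstract names both end genes of the region. Filtering by parsed chromosome and locus number, rather than by position in the file, keeps the sketch independent of record order.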