[Molecular Cloning, Characterization, Chromosomal Assignment, Genomic Organization and Verification of SFRS12 (SRrp508), a Novel Member of the Human SR Protein Superfamily and a Human Homolog of Rat SRrp86].
De-Li Zhang, Xiao-Jing Sun, Lun-Jiang Ling, Run-Sheng Chen, Da-Long Ma
2002-01-01
Abstract: We have identified and characterized a novel human gene, serine-arginine-rich (SR) splicing regulatory protein 508 (SRrp508), that is related to other members of the growing SR superfamily but is homologous only to the rat (Rattus norvegicus) serine-arginine-rich splicing regulatory protein 86 (SRrp86) gene. The full-length cDNA of 3811 bp for human SRrp508 was cloned through a BLAST search of public databases, starting from a 658 bp cDNA contig obtained by fully automated, large-scale EST assembly on a supercomputer. Structurally, human SRrp508 encodes a polypeptide of 508 amino acids, which contains a single amino-terminal RNA recognition motif (RRM) and two carboxy-terminal domains rich in serine-arginine dipeptides that are highly conserved among other members of the SR superfamily. The conserved SR and RRM domains emphasize the biological importance of this gene. The SRrp508 gene, which contains 12 exons ranging from 0.096 to 2.093 kb and 11 introns ranging from 0.14 to 5.153 kb, was mapped to human cytogenetic region 5q11.2-q12.1 by bioinformatic analysis, and it is not linked to any other gene. Furthermore, we experimentally cloned and sequenced a cDNA fragment of 1680 bp containing the full-length ORF of 1527 bp of this novel human gene by RT-PCR from a single-stranded human pancreas cDNA library (Clontech); nucleotide sequencing showed it to be fully identical to the sequence cloned in silico. Thus, this gene, deposited under GenBank accession number AF459094, was cloned in silico and identified solely by bioinformatic analysis of nucleotide and protein sequences. This novel gene has promoters, a TATA box, several stop codons upstream of the ORF, and a polyA signal downstream of the ORF. Based on these results, we conclude that we have obtained a complete novel human gene.
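The lengths reported above can be checked against one another with simple arithmetic. The following is a minimal sketch, assuming that the stated ORF length counts the stop codon (the abstract does not say so explicitly); the function and constant names are hypothetical and chosen only for illustration.

```python
def orf_length_bp(n_amino_acids: int, include_stop: bool = True) -> int:
    """Length in bp of an ORF encoding n_amino_acids residues.

    Assumption: one codon (3 bp) per residue, plus one stop codon
    when include_stop is True.
    """
    codons = n_amino_acids + (1 if include_stop else 0)
    return 3 * codons


def intron_count(n_exons: int) -> int:
    """Number of introns separating n_exons consecutive exons."""
    return max(n_exons - 1, 0)


# Values taken from the abstract.
PROTEIN_LENGTH_AA = 508
ORF_LENGTH_BP = 1527
RT_PCR_FRAGMENT_BP = 1680
N_EXONS = 12
N_INTRONS = 11

# 508 residues + 1 stop codon = 509 codons = 1527 bp.
orf_matches = orf_length_bp(PROTEIN_LENGTH_AA) == ORF_LENGTH_BP

# 12 exons are separated by 11 introns.
introns_match = intron_count(N_EXONS) == N_INTRONS

# Sequence in the RT-PCR fragment that lies outside the ORF.
flanking_bp = RT_PCR_FRAGMENT_BP - ORF_LENGTH_BP

print(orf_matches, introns_match, flanking_bp)
```

Under this assumption the reported figures are mutually consistent, and the 1680 bp RT-PCR fragment carries 153 bp of sequence flanking the ORF.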
The gene sequence exhibits good overall homology to that of the rat SRrp86 gene, with 84% and 86% identity over the full-length nucleotide and protein sequences, respectively, and with 96% and 86% identity over the serine-rich domain (RS) and the arginine-rich domain (RA), respectively. The full-length sequence exhibits little overall homology to any other known protein at either the nucleotide or the amino acid level. The next two most closely related proteins are Drosophila serine-arginine-rich protein 54 (SRp54) and human arginine-rich nuclear protein 54 (p54), with 34% and 35% identity over the full-length protein, respectively, and 51% and 54% identity over the full-length ORF nucleotide sequence, respectively. When comparisons are restricted to the RS or RA domains, the percent identity increases for both SRp54 and p54, to 44% and 54% (RS) or 38% and 43% (RA), respectively. These results demonstrate that only the novel human protein of 508 amino acids cloned here is the human homolog of rat SRrp86, thus correcting the view of Barnard and Patton (Barnard DC, Patton JG. Identification and Characterization of a Novel Serine-Arginine-Rich Splicing Regulatory Protein. Molecular and Cellular Biology, 2000, 20(9): 3049-3057) that human arginine-rich nuclear protein 54 (p54) is the human homolog of rat SRrp86, and suggesting that human SRrp508 is a new member of this growing superfamily of SR proteins. SRrp508 has a broad expression profile and may be a transcription factor. On the basis of its sequence and functional properties, we have named this protein SRrp508, for SR-related splicing regulatory protein of 508 amino acids. In summary, by combining bioinformatic analysis with experimental verification, we have successfully cloned the human cDNA homolog of rat SRrp86, as verified by a series of theoretical and experimental evidence.
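For readers unfamiliar with how identity percentages such as those above are derived, the following is a minimal sketch of one common definition: the share of matching positions among the aligned, non-gap columns of a pairwise alignment. The abstract does not state which alignment tool or identity definition the authors used, so this definition is an assumption, and the function name and toy sequences are hypothetical rather than taken from SRrp508 or SRrp86.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns containing a gap in either sequence are excluded
    from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a.upper() == b.upper():
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment of two short SR-like peptides.
toy_a = "RSRSRSPSRR"
toy_b = "RSRSKSPSRR"
print(percent_identity(toy_a, toy_b))  # 9 of 10 positions match
```

Other definitions divide by the length of the shorter sequence or by the full alignment length including gaps, so identity values are comparable only when the same convention is used throughout.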
The HGNC has recently assigned the SRrp508 gene entry the following nomenclature: APPROVED SYMBOL: SFRS12; NAME: splicing factor, arginine/serine-rich 12; ALIAS: DKFZp564B176, SRrp86. In the nearly one year since we cloned this gene, no one else has registered the same gene in GenBank. Our newly established technical pipeline should help in discovering many more novel human genes.