Crystal Structure of B. subtilis YjcG Characterizing the YjcG-Like Group of 2H Phosphoesterase Superfamily
Dan Li, Cong Liu, Yu-He Liang, Lan-Fen Li, Xiao-Dong Su
DOI: https://doi.org/10.1002/prot.22093
2008-01-01
Proteins Structure Function and Bioinformatics
Abstract: The 2H phosphoesterase superfamily was first identified by Mazumder et al.1 Enzymes of this superfamily share a common catalytic site and retain cyclic phosphodiesterase activity, although their protein sequences are highly diverse. The superfamily was named after the two His residues from two conserved tetrapeptide motifs, HX(T/S)X (X, a hydrophobic residue), in the active site. Site-directed mutagenesis indicated that the two His residues are essential for enzyme activity.2 The two conserved HX(T/S)X motifs are the defining characteristic of the superfamily. Its members occur in almost all known species. They can even be traced to the last universal common ancestor (LUCA) of all known life forms, with a role in RNA metabolism, and have subsequently evolved to take on several distinct biological roles.1 The 2H phosphoesterase superfamily has been classified into four major groups and a few divergent members. The four major groups are the archaeo-bacterial LigT-like group, the eukaryotic-viral LigT-like group, the YjcG-like group, and the mlr3352-like group.1 During a structural genomics investigation at Peking University, we selected the Bacillus subtilis yjcG gene as one of the target genes because of its low homology to any protein of known structure.3 Here we present the crystal structure of B. subtilis YjcG, which is the archetype of the YjcG-like group. As the first three-dimensional structure of the group and the first 2H phosphoesterase from bacteria, it helps to complete the structural landscape of the 2H phosphoesterase superfamily and should shed light on the functional investigation of the YjcG-like group.

The yjcG gene was amplified by polymerase chain reaction (PCR) from the genomic DNA of Bacillus subtilis strain 168 and cloned into pET21-DEST as reported in our previous work.4 Selenomethionine (SeMet)-substituted protein was expressed in E.
coli strain BL21(DE3) cultured in M9 minimal medium. Methionine biosynthesis was suppressed and SeMet was incorporated as described by van Duyne et al.5 The purification and crystallization of the SeMet-incorporated protein were carried out as for the native protein, as described previously.4 In brief, the protein was purified to homogeneity by successive HiTrap Chelating and Superdex 75 columns (GE Healthcare, Piscataway, NJ).

Crystallization was carried out using the hanging-drop vapor-diffusion method at 293 K. About 1 μL of protein solution (10 mg/mL) was mixed with 1 μL of reservoir solution (0.1 M Bis-Tris pH 7.3, 24% PEG MME 2000) and equilibrated against 500 μL of reservoir solution.

Diffraction data of the SeMet YjcG were collected on a MAR CCD detector at beamline 3W1A, Beijing Synchrotron Radiation Facility (BSRF), China. The SeMet crystal was flash-frozen and maintained at 100 K using a nitrogen gas stream (Oxford Instruments) during data collection. Mother liquor supplemented with 10% (v/v) glycerol was used as cryoprotectant. Three data sets were collected at the Se absorption peak, the edge, and a low-energy remote wavelength, to a highest resolution of 2.3 Å. The data were indexed and scaled with DENZO and SCALEPACK.6 The crystal belongs to the space group C2, with unit cell parameters of a = 99.3 Å, b = 73.8 Å, c = 61.6 Å, β = 113.5°. Data collection statistics are summarized in Table I.

The Se sites and initial phases of YjcG were determined and refined from the three-wavelength MAD (multiple-wavelength anomalous dispersion) data sets using the program SOLVE/RESOLVE.7, 8 Automated RESOLVE model-building9 was used to build about 70% of the main chain. Further model building was performed manually using the graphical program O, and the model was refined using CNS.10 The final R and Rfree factors are 23.7% and 26.7%, respectively, for reflections in the resolution range of 50–2.3 Å.
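As a rough consistency check on the cell parameters quoted above, the volume of a monoclinic cell follows from V = a·b·c·sin β. The sketch below also estimates the Matthews coefficient. It is not the authors' own calculation: it assumes the 19.6 kDa monomer mass and the two molecules per asymmetric unit reported elsewhere in this paper, four asymmetric units per cell for space group C2, and the conventional approximation that the solvent fraction is about 1 − 1.23/V_M. All function names are illustrative.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (cubic angstroms) of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(volume, mass_da, molecules_per_asu, asu_per_cell):
    """Matthews coefficient V_M: cell volume per dalton of protein."""
    return volume / (mass_da * molecules_per_asu * asu_per_cell)


def solvent_fraction(vm):
    """Conventional approximation: solvent fraction ~ 1 - 1.23 / V_M."""
    return 1.0 - 1.23 / vm


# Cell parameters as reported in the text (space group C2).
volume = monoclinic_cell_volume(99.3, 73.8, 61.6, 113.5)

# Assumptions: 19.6 kDa monomer, 2 molecules per asymmetric unit,
# 4 asymmetric units per unit cell in C2.
vm = matthews_coefficient(volume, 19600.0, molecules_per_asu=2, asu_per_cell=4)
solvent = solvent_fraction(vm)

print(f"V = {volume:.0f} A^3, V_M = {vm:.2f} A^3/Da, solvent = {solvent:.0%}")
```

Under these assumptions the cell volume comes out a little above 4 × 10^5 Å³, and V_M is about 2.6 Å³/Da, which corresponds to roughly half of the crystal volume being solvent.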
The geometry was validated with the program PROCHECK,11 and 98.3% of the total residues were found in the allowed regions of the Ramachandran plot. The refinement statistics of the YjcG structure are listed in Table I. Structure figures were created with PyMOL (DeLano Scientific).

The final model of YjcG contains 336 residues and 244 water molecules, covering the two molecules in one asymmetric unit. The full-length YjcG consists of 171 amino acids, but for both molecules the last three residues could not be built into the model because of the poor electron density at the C-termini. The overall fold of the YjcG structure belongs to the α/β fold type [Fig. 1(a)]. Four α helices and eight β strands weave back and forth to form two symmetric lobes. The so-called terminal lobe12 [Fig. 1(a)], which harbors both ends of the protein, consists of β1/β2/α2/α3/β7/β8; the opposite lobe, the transit lobe [Fig. 1(a)], consists of α1/β3/β4/β5/α4/β6. The four β strands in each lobe compose an antiparallel β sheet. The two β sheets from the two compact lobes form the β barrel architecture. A water-filled cavity framed by the β barrel is the putative active center. The highly conserved catalytic HX(T/S)X repeats are located in this cavity. The α helices wrap around the β barrel symmetrically.

Figure 1. (a) YjcG monomeric structure and topology. The terminal lobe is in blue and the transit lobe is in green. The secondary elements are numbered from α1 to α4 and β1 to β8; (b) YjcG dimerization. The blue curve shows the elution of YjcG from Superdex 75 (GE Healthcare) and the purple curve shows the standard protein marker. The molecular weights of the marker components are labeled. The right graph shows the buried interface of the YjcG dimer. The light blue area covers the interface residues of chain B. The broken line shows the symmetry axis. (c) Electrostatic surface potential comparison of YjcG_Bs, RNA ligase (1IHU_Tt), CPDase (1JH6_At), and CNPase (1WOJ_Hs) from left to right.
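The HX(T/S)X signature that places the catalytic residues in the cavity can be searched for in a sequence with a simple pattern. The following is a minimal sketch, not the procedure used in this work. The set of residues counted as hydrophobic is an assumption, as the text only says that X is a hydrophobic residue, and the example sequence is an invented toy string, not the YjcG sequence.

```python
import re

# Assumption: residues treated as "hydrophobic" for the X positions.
HYDROPHOBIC = "AVLIMFWY"

# H, hydrophobic X, Thr or Ser, hydrophobic X.
MOTIF = re.compile(f"H[{HYDROPHOBIC}][TS][{HYDROPHOBIC}]")


def find_2h_motifs(sequence):
    """Return (1-based start position, matched tetrapeptide) for each hit."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence carrying two motif copies, for illustration only.
toy_sequence = "MKAHVTLAGGSDEHLSVKK"
hits = find_2h_motifs(toy_sequence)
print(hits)
```

A sequence would be a candidate 2H family member under this crude test only if it carries two such hits; real assignments rest on alignment and structure rather than on a pattern match alone.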
YjcG behaves as a dimer both in solution and in the crystal lattice. The calculated molecular weight of the YjcG monomer is 19.6 kDa, whereas in solution the protein eluted from the Superdex 75 column (GE Healthcare, Piscataway, NJ) at an apparent size of about 40 kDa [Fig. 1(b)]. In one asymmetric unit of the YjcG crystals, two molecules (chains A and B) are packed side by side. The average buried surface area is 983.2 Å² (calculated with the EMBL-EBI online server PISA), which is higher than the "threshold" given by Janin13 as being biologically significant. Fifteen residues at the N terminus and 16 residues at the C terminus of chain A interact with the equivalent residues of chain B at the interface [Fig. 1(b)]. Viewed from above, the dimer shows a two-fold axis [Fig. 1(b)]. The dimerization is a new feature that has not been observed in any other member of the 2H superfamily. However, the biological significance of the dimerization remains to be established by future investigation.

According to the protein sequence alignment, the overall identities among the different family members are quite low, except for the strictly conserved catalytic HX(T/S)X motifs [Fig. 2(a)]. Crystallographic studies on several members from other groups of the 2H phosphoesterase superfamily have been reported.12, 15-19 Structural superposition shows that the homologous structures are not very similar overall [Fig. 2(b)]. However, the side chains of the His and Thr/Ser residues in the HX(T/S)X signature motifs are structurally conserved [Fig. 2(b)]. Consequently, the cyclic phosphodiesterase catalytic activity has been retained. Although the basic secondary structure elements and structural topology are conserved across the superfamily,1, 18 the overall structures have evolved to acquire new biological functions. Thus, structural information could be of great help in the assignment of the 2H phosphoesterase superfamily.
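The dimer assignment described earlier in this section can be cross-checked with simple arithmetic on numbers quoted in the text: the apparent size from gel filtration against the calculated monomer mass, and the residue count of the final model against two truncated chains. This is only an illustrative sketch with hypothetical function names. An apparent mass from size exclusion depends on molecular shape, so the ratio is an estimate rather than a measurement of stoichiometry.

```python
def estimate_subunits(apparent_kda, monomer_kda):
    """Nearest whole number of subunits implied by an apparent mass."""
    return round(apparent_kda / monomer_kda)


def modelled_residues(full_length, unbuilt_c_terminal, chains):
    """Residues expected in the model when each chain lacks its C-terminal tail."""
    return chains * (full_length - unbuilt_c_terminal)


# Values quoted in the text.
subunits = estimate_subunits(apparent_kda=40.0, monomer_kda=19.6)
residues = modelled_residues(full_length=171, unbuilt_c_terminal=3, chains=2)

print(f"subunits in solution ~ {subunits}, residues in model = {residues}")
```

Both checks agree with the text: the elution size corresponds to about two monomers, and two chains of 168 built residues account for the 336 residues in the final model.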
Figure 2. (a) Structure-based multiple sequence alignment drawn using ESPript 2.2.14 The sequences are labeled with their PDB IDs (except for YjcG) and the abbreviations of their source Latin names: Bs, Bacillus subtilis; Tt, Thermus thermophilus; Ph, Pyrococcus horikoshii; At, Arabidopsis thaliana; Ca, Carassius auratus; Rn, Rattus norvegicus; Hs, Homo sapiens. The conserved HX(T/S)X motifs are framed by green boxes; (b) Structure superposition of 2H phosphoesterase homologs. A zoom-in view of the conserved HX(T/S)X motifs in the active cavity is shown on the right. The color scheme is explained in the color bar.

The electrostatic surface potential of archaeal RNA ligases shows that the active cavity is wide open and positively charged, which is compatible with their specific interaction with RNA substrates [Fig. 1(c)]. In contrast, the active center of plant CPDase (cyclic phosphodiesterase) is a small cleft suited to its substrate Appr>p, as is that of human CNPase (2′,3′-cyclic nucleotide phosphodiesterase) for CNP [Fig. 1(c)]. By comparison, the active center of YjcG can accommodate molecules larger than Appr>p or CNP, but is not sufficiently charged to interact with strongly negatively charged molecules such as RNA [Fig. 1(c)].

In conclusion, the crystallographic and biochemical studies on YjcG reported here will contribute to our understanding of the YjcG-like group of the 2H phosphoesterase superfamily. As an ancient superfamily of enzymes, the 2H phosphoesterases are involved in several divergent biological processes. Protein function has evolved through mutation of the primary sequence, whereas the structural topology and the catalytic residues (the HX(T/S)X motifs) have remained highly conserved, preserving the catalytic activity. Thus, the 2H phosphoesterase superfamily offers a natural model for investigating the structural and functional evolution of proteins. The coordinates of YjcG have been deposited in the RCSB Protein Data Bank with accession code 2D4G.
The authors thank the beamline staff (Drs. Yu-Hui Dong and Peng Liu) at BSRF for help with data collection.