Crystal Structure of the C‐terminal Conserved Domain of Human GRP, a Galectin‐related Protein, Reveals a Function Mode Different from Those of Galectins
Dongwen Zhou,Honghua Ge,Jianping Sun,Yongxiang Gao,Maikun Teng,Liwen Niu
DOI: https://doi.org/10.1002/prot.22003
2008-01-01
Abstract: Lectins are a group of proteins that recognize carbohydrates covalently linked to proteins and lipids on the cell surface and within the extracellular matrix. They have diverse physiological functions [1-3], including growth regulation [4-6], cell adhesion [7, 8], pre-mRNA splicing [9, 10], cell migration [11, 12], cell apoptosis [13-15], immune responses [16], and pathogen recognition [17]. The galectins are a family of animal lectins defined by their shared conserved carbohydrate recognition domain (CRD) of about 130 amino acids and their affinity for β-galactoside sugars [18, 19]. The galectins can be classified into three subfamilies, the proto-, chimera-, and tandem-repeat types, based on their domain organization [2, 20]. Gal-1, 2, 5, 7, 10, 11, 13, 14, and 15 are prototype galectins, which contain one carbohydrate-recognition domain per subunit and are usually homodimers of noncovalently linked subunits [3, 21]. Gal-3 is the only chimera-type galectin; it contains a C-terminal CRD and a somewhat long N-terminal peptide rich in proline and glycine [22]. Gal-4, 6, 8, 9, and 12 are tandem-repeat galectins, which contain two CRDs joined by a linker peptide and are monomeric [23]. Fifteen mammalian galectins have been described so far [21, 24]. Meanwhile, several galectin relatives have been described, such as the lens crystallin protein GRIFIN and the hematopoietic stem cell precursor HSPC159 [24]. These proteins share significant sequence deviations at the residues most critical for carbohydrate binding, and as a result they all lack β-galactoside-binding activity. In recent years, the X-ray crystal structures of several galectins, including Gal-1 [25-27], Gal-2 [28, 29], Gal-3 [30], Gal-7 [31], and Gal-10 [32], have been reported. They are all similar and show the jelly-roll topology typical of legume lectins. Their CRDs are all composed of an 11- or 12-strand antiparallel β-sandwich, and some of them have short 3₁₀ helices.
The general architectures of the carbohydrate-binding sites in galectins of known three-dimensional structure are very similar. The structure of the human Gal-1-β-galactoside complex reveals that the amino acids His44, Asn46, Arg48, Val59, Asn61, Trp68, Glu71, and Arg73 are directly involved in interactions with the bound disaccharide [27].
GRP (previously known as HSPC159) is a novel human galectin-related protein whose gene was originally deduced by partial sequence alignment and confirmed by a full-length sequence for an mRNA isolated from CD34+ hematopoietic stem cells. The human GRP gene is located on chromosome 2p13 and is composed of five exons, with exon/intron junctions located in positions generally conserved across the galectin family. The GRP sequence is evolutionarily ancient and highly conserved: very similar cDNA sequences have been found in human, mouse, chicken, frog, and fish. GRP shares consensus amino acids at 51 of the 64 most highly conserved residues in other galectins. On the other hand, its sequence deviates significantly at five of the seven residues most critical for carbohydrate binding [24, 33]. The biological functions of human GRP (hGRP) remain unclear. In this work, we have determined the crystal structure of the C-terminal conserved domain of hGRP (hGRP-C) at 1.9 Å resolution. In this structure, hGRP-C adopts a 10-strand antiparallel β-sandwich fold similar to that of other galectin structures. However, the architecture of the carbohydrate-binding site in hGRP-C is completely different from that of galectins of known structure, which suggests a novel mode by which GRP carries out its biological function in vivo.
Protein expression, purification, crystallization, and data collection have been described elsewhere [34]. The structure of hGRP-C was determined by molecular replacement using modified Gal-4 and Gal-7 structures (PDB codes 1X50 and 1BKZ, respectively) as alternative search models.
A solution with four molecules in the asymmetric unit was found for data between 20.0 and 3.0 Å by the program PHASER [35]. At the beginning of the refinement, residues with unclear side-chain electron density were mutated to alanine or glycine. A randomly selected 5% of the data was set aside for the R-free [36] calculation. The initial refinement included rigid-body refinement with the program REFMAC5 [37] and simulated annealing with the program CNS [38]. Further refinement was performed using restrained refinement in REFMAC5. The noncrystallographic symmetry restraints were set to medium. During refinement, the model was examined and corrected manually by inspection of 2Fo-Fc and Fo-Fc electron density maps using the graphics programs O [39] and COOT [40]. After a number of cycles of restrained refinement and the addition of water molecules, the structural model was finally refined to an R-factor of 21.7% and an R-free of 26.5%. The refinement statistics are shown in Table I. The program PROCHECK [41] was used to evaluate the stereochemistry. Figures were created using PyMOL (DeLano Scientific). The final coordinates and structure factors have been deposited in the Protein Data Bank (http://www.rcsb.org/pdb) and are accessible under the accession code 3B9C.
The hGRP-C protein pooled from the Ni-NTA column was run on a HiLoad™ 16/60 Superdex™ 75 size-exclusion column (Amersham Biosciences) with a fast protein liquid chromatography system (AKTA purifier). The column was equilibrated with running buffer (20 mM Tris-HCl, pH 8.0, and 200 mM NaCl) and calibrated using a series of molecular weight marker proteins from the LMW Gel Filtration Calibration Kit (Amersham Biosciences).
The eluate of hGRP-C from the Ni-NTA column was loaded onto an α-lactose-agarose (SIGMA) column preequilibrated with PBS (pH 7.2) to test the lactose-binding ability of hGRP-C.
The column was washed with PBS and then eluted with the same buffer containing 25 mM α-lactose.
The refined structural model contains four polypeptide chains, seven sulfate ions, six thioglycol molecules, and 546 water molecules. Each of the four monomers comprises 134 amino acid residues (3–136), excluding the N-terminal hexahistidine tag and the first two residues (Met-1, Val-2). The electron density was well defined for all residues except the side chains of Lys-9, Lys-17, and Asp-42, presumably because these comparatively long, solvent-exposed side chains are conformationally flexible. The final R-work and R-free of the model are 21.7% and 26.5%, respectively. The root-mean-square (rms) deviations from ideal bond lengths and angles are 0.011 Å and 1.320°, respectively. The Ramachandran plot, prepared using the program PROCHECK, shows that 86.5% and 13.5% of all residues fall within the most favored and additionally allowed regions, respectively.
The crystal structure of human GRP-C has been solved to 1.9 Å resolution using the molecular replacement method. GRP-C adopts a typical galectin fold, a β-sandwich consisting of two antiparallel five-stranded β-sheets (S1–S5 and F1–F5). Similar to galectin-7 and galectin-10, hGRP-C has a short 3₁₀ helix positioned between strands F5 and S2 [Fig. 1(a)]. This topology was named "jelly-roll" by previous investigators and is the typical folding pattern of galectins.
Figure 1. Overall three-dimensional structure of hGRP-C. (a) Ribbon representation of the dimer organization of hGRP-C (chains C and D). Standard numbering of the secondary structure elements is indicated on one polypeptide (S1–S5/F1–F5, η1). (b) Tetramer arrangement of hGRP-C. The four monomers A, B, C, and D are colored green, yellow, cyan, and hot pink, respectively. The two homodimers (A,B and C,D) are approximately perpendicular. (c) Tetramer structure of hGRP-C viewed from another angle.
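The R-work and R-free values reported for the refined model follow the conventional crystallographic residual, R = Σ| |Fo| − |Fc| | / Σ|Fo|, evaluated on the working reflections and on the 5% of reflections held out of refinement. The following is a minimal sketch of that bookkeeping, not the procedure implemented in REFMAC5 or CNS. It assumes the observed and calculated amplitudes are already on a common scale, and the function names, the seed, and the toy amplitudes are hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """Return sum(| |Fo| - |Fc| |) / sum(|Fo|) for paired amplitudes.

    Assumes f_calc has already been scaled to f_obs (an assumption of
    this sketch; refinement programs determine that scale themselves).
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def pick_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly choose the indices of reflections held out for R-free."""
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * fraction))
    return set(rng.sample(range(n_reflections), n_free))


def r_work_and_r_free(f_obs, f_calc, free_indices):
    """Compute R on the working set and on the held-out (free) set."""
    work = [(o, c) for i, (o, c) in enumerate(zip(f_obs, f_calc))
            if i not in free_indices]
    free = [(o, c) for i, (o, c) in enumerate(zip(f_obs, f_calc))
            if i in free_indices]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


# Toy amplitudes (invented for illustration, not from the 3B9C data).
toy_obs = [100.0, 200.0, 300.0, 400.0]
toy_calc = [90.0, 210.0, 300.0, 380.0]
toy_r_work, toy_r_free = r_work_and_r_free(toy_obs, toy_calc, {3})
```

Because the free reflections never enter refinement, R-free is normally somewhat higher than R-work, as it is for the values reported here (26.5% versus 21.7%).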
Unlike most prototype galectins, which usually crystallize as dimers, hGRP-C has four monomers (chains A, B, C, and D) in the asymmetric unit. The four molecules are nearly identical: the root-mean-square deviations (RMSD) of Cα atom positions between monomer A and monomers B, C, and D are 0.142, 0.108, and 0.120 Å, respectively. Monomers A and B are related by a noncrystallographic 2-fold symmetry axis, with both the N and C termini at the far ends, and form a homodimer; so do monomers C and D. These two homodimers, arranged approximately perpendicular to each other with their concave sides facing, form a tetramer that is shaped like a screw cap when viewed from certain angles [Fig. 1(b,c)]. Prototype galectins, except for galectin-5, which exists as a monomer in solution, are noncovalent homodimers composed of two identical CRDs [42]. Human GRP-C has a striking noncrystallographic tetramer arrangement that differs markedly from all other galectins, although it is present as a monomer in solution: size exclusion chromatography gave a single peak for hGRP-C with a molecular weight of ∼10.9 kDa [Fig. 2(c)]. Gal-1 and Gal-2 are canonical 2-fold symmetric dimers in which the two monomers are related by a 2-fold axis perpendicular to the plane of the β-sheets [26, 27, 29]. The dimer interface in the hGal-7 crystal involves the association of two β-sheets from the two monomers, arranged with the two convex sides packed against each other [31]. The four monomers of the tetramer in the hGRP-C crystal are related by two different noncrystallographic symmetries, unlike those of the fungal galectin CGL2, whose four monomers are interrelated by two perpendicular 2-fold axes of rotation [28]. In the hGRP-C crystallographic tetramer, monomers A and B, as well as monomers C and D, are related by a 2-fold noncrystallographic symmetry axis. This mode of tetramer association has not been found in any other galectin to date.
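The Cα RMSD values quoted for the four chains are root-mean-square distances between paired atoms. The sketch below shows only that final arithmetic step. It assumes the two coordinate sets have already been optimally superposed and matched atom by atom; the text does not state which program performed the superposition. The function name and the toy coordinates are hypothetical and are not taken from the deposited structure.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) positions.

    Assumes both lists are in the same atom order and that coords_b has
    already been superposed onto coords_a (no fitting is done here).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom displaced by 1 Å along z gives an RMSD of 1.0 Å.
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_rmsd = ca_rmsd(chain_a, chain_b)
```

Values near 0.1 Å, as reported between monomer A and the other three chains, mean the copies are essentially indistinguishable at 1.9 Å resolution.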
Figure 2. (a) Structure-based sequence alignment of human GRP-C and several galectins of known structure, including hGal-1, hGal-2, hGal-3, hGal-7, and hGal-10. Secondary structure elements of human GRP-C are shown above the alignment (β, β-strand; η, 3₁₀-helix; TT, β-turn). Numbers above the sequences correspond to the hGRP-C sequence. The figure was produced with ESPript [43] based on an alignment by Multalin [44]. (b) Stereodiagram showing the superposition of the Cα traces of five galectin homologs. Proteins are colored as follows: hGRP-C, salmon; hGal-1, magenta; hGal-2, sky blue; hGal-3, green; hGal-7, cyan. (c) Size exclusion chromatography of hGRP-C, showing a single peak with a calculated molecular weight of ∼10.9 kDa.
A primary sequence BLAST search reveals that hGRP most resembles the prototype subfamily of galectins and shows comparatively high identity with human Gal-4 and Gal-7 (31% and 29%, respectively) among currently identified galectins. However, hGRP notably shows a serious sequence deviation in the highly conserved carbohydrate recognition segment (β-strands S4–S5). In particular, E50 and R70 replace the histidine and tryptophan residues, respectively, which are crucial for sugar binding and are invariant in all galectins carrying conserved CRDs. In addition, two proline residues that are absent in all other galectins are inserted between S3 and S4 [Fig. 2(a)]. This early evolutionary sequence divergence reflects a functional differentiation between GRP and galectins, while the high sequence conservation of GRP among different species indicates that it is an ancient protein.
To analyze the similarity of the three-dimensional structures of hGRP-C and galectins, we superposed the α-carbons of the monomers of hGRP-C and hGal-1 (1GZW), hGal-2 (1HLC), hGal-3 (1A3K), and hGal-7 (1BKZ). The RMSDs are 1.59 Å for 125 aligned Cα atoms with hGal-1, 1.34 Å for 119 Cα atoms with hGal-2, 1.25 Å for 130 Cα atoms with hGal-7, and 1.06 Å for 127 Cα atoms with hGal-3 [Fig. 2(b)].
This alignment result indicates that hGRP is closely related to the galectin family.
Carbohydrate recognition is the first step by which lectins mediate cell–cell or cell–matrix interactions. Previous investigations have clarified the sugar-binding mechanisms of galectins, which are similar in all galectins of known structure. As a preliminary study, we first performed a lactose-binding test, which showed that hGRP does not display any β-galactoside-binding activity (data not shown). To seek additional evidence for this observation, we cocrystallized hGRP-C with lactose and N-acetyl allolactosamine at 277 K. The resulting crystals were used for X-ray diffraction experiments, the data set was processed, and the structure was determined. Consistent with the earlier α-lactose-agarose binding test, the final refined electron density map does not contain any bound carbohydrate.
To find out why hGRP-C lacks carbohydrate recognition activity, we compared the "carbohydrate-binding cassettes" of hGRP-C and hGal-7. Previous crystallographic investigations indicate that seven highly conserved amino acids play key roles in carbohydrate recognition. In the hGal-7-galactose complex, His49 makes a hydrogen bond with galactose O4, Asn51 makes a hydrogen bond with O4, Arg53 forms two hydrogen bonds with O4 and O5, Asn62 makes a hydrogen bond with O6, and Glu72 forms two hydrogen bonds with O5 and O6. Trp69 provides stacking interactions with the galactose moiety. Thr56, Glu58, Glu72, and Arg74 contribute to a network of ionic interactions important for the optimal orientation of carbohydrates in the pocket [31]. In hGRP-C, there are five substitutions among the seven key positions involved in sugar binding. Two substitutions in particular are critical: valine instead of arginine at position 54, and arginine instead of tryptophan at position 70. As the equivalent of hGal-7 Arg53, Val54 of hGRP-C cannot form hydrogen bonds with galactose O4 and O5.
At the other critical position, Arg70 blocks the formation of the sugar-binding pocket and cannot provide stacking interactions with a bound galactose moiety. At positions 50 and 52, Glu and Lys cannot provide a suitable cavity for galactoside occupancy as His and Asn do, although they could also make hydrogen bonds with galactose O4 [Fig. 3(a)]. Comparing the surfaces of hGRP-C and hGal-7, we found that the sugar-binding site architectures of the two proteins are completely different. In particular, the protruding residue Lys52 disrupts the carbohydrate-binding pocket, making it impossible for galactose to enter and orient correctly in the groove [Fig. 3(b)]. This structural evidence indicates that hGRP does not rely on protein-galactoside recognition to perform its biological functions.
Figure 3. (a) Superposition of the carbohydrate-recognition sites of human GRP-C and human galectin-7. hGRP-C and hGal-7 are colored orange and blue, respectively. hGRP-C residues E50, K52, V54, N63, R70, E73, and S75 at the sugar-binding positions and the equivalent residues of hGal-7 are shown as stick models, with their carbon atoms colored green and blue, respectively. Galactose carbon atoms are colored gray. (b) Comparison of the carbohydrate-binding pockets of hGal-7 and hGRP-C. In the left diagram, the groove indicated by a black arrow is the sugar-binding pocket of hGal-7. In the surface diagram of hGRP-C, the region equivalent to the hGal-7 binding pocket is marked by a black dashed box.
In summary, hGRP is a novel protein that closely resembles galectins in sequence but is not a member of the galectin family because it lacks lactose-binding activity. The X-ray crystal structure of hGRP-C reported here reveals that hGRP-C has a typical galectin fold, a 10-strand antiparallel β-sandwich. At the same time, the structure also reveals a significant difference in the architecture of the sugar-binding site between hGRP-C and galectins.
This suggests that hGRP may adopt a novel functional mode that differs markedly from those of galectins. This unknown mode remains to be determined in future work.
We are grateful to Dr. Xiaohua Lou and Dr. Wei Zhao for useful discussions on structure determination, and to Dr. Shuilong Tong and Dr. Zhiqiang Zhu for their help with data collection.