Crystal structure of the copper homeostasis protein (CutCm) from Shigella flexneri at 1.7 Å resolution: The first structure of a new sequence family of TIM barrels
De-Yu Zhu, Yong-Qun Zhu, Ren-Huai Huang, Ye Xiang, Na Yang, Hongxia Lü, Genpei Pei Li, Qi Jin, Da-Cheng Wang
Abstract: Copper is an essential heavy-metal trace element for organisms, and organisms have evolved their own copper homeostasis mechanisms.1 Two gene families appear to be associated with copper homeostasis in bacteria. One is the cop gene family, which encodes a well-understood system of active-transport efflux pumps, and the other is the cut gene family, which has 6 members (cutA, cutB, cutC, cutD, cutE, cutF).2, 3 So far, the detailed functions of the proteins encoded by the cut genes are not very clear. The putative copper homeostasis protein (CutCm) encoded by the cutCm gene of Shigella flexneri 2a str. 301 consists of 248 residues and belongs to the CutC family (Pfam-PF03932) (Fig. 1).4, 5 Some studies have suggested that CutCm may play a role in intracellular trafficking of Cu(I),8, 9 but the actual function of CutCm is still unknown.

Fig. 1. Secondary structure of S. flexneri CutCm and sequence alignment with homologous proteins of the CutC family. The numbering above the alignment corresponds to S. flexneri CutCm. The secondary-structure elements (β-strands, α-helices) found in S. flexneri CutCm are shown above the sequences. The box marks the residues missing from the crystal structure of CutCm owing to lack of electron density. Conserved residues are shaded light gray and highly conserved residues dark gray. The seven aligned homologous proteins come from Pfam5 (proteins are named by their GenBank accession IDs): AAF84150 (Xylella fastidiosa 9a5c), AAF93895 (Vibrio cholerae O1 biovar eltor str. N16961), AAK06092 (Lactococcus lactis subsp. lactis Il1403), AAK24331 (Caulobacter crescentus CB15), AAK33440 (Streptococcus pyogenes M1 GAS), AAK02610 (Pasteurella multocida subsp. multocida str. Pm70), and BAB48567 (Mesorhizobium loti).
Sequence alignment was performed with CLUSTALW6 and the figure was prepared with ALSCRIPT.7

Here we report the crystal structure of CutCm determined using the multiple-wavelength anomalous dispersion (MAD) method, which is the first three-dimensional (3D) structure of the CutC family. The tertiary structure of CutCm adopts a classical triosephosphate isomerase (TIM) barrel fold.10 The structure and sequence comparisons show that the CutCm structure is the representative of a new sequence family of TIM barrels.

Polymerase chain reaction (PCR) primers, including NdeI and XhoI restriction sites, were designed to amplify the cutCm gene (GenBank: NP_707762) from the genomic DNA of S. flexneri 2a str. 301.4 The gene was inserted into the pET22b(+) vector (Novagen), yielding a C-terminally His-tagged protein. The protein was overexpressed in Escherichia coli BL21-Codon Plus(DE3)-RIL (Stratagene). The cells were grown in Luria–Bertani (LB) medium at 37°C and induced with 1 mM isopropyl-β-D-thiogalactopyranoside (IPTG) when the culture reached an OD600 of 0.6–0.8. The harvested cells were resuspended in 30 mL lysis buffer (50 mM NaH2PO4, pH 8.0; 300 mM NaCl; 10 mM imidazole) with 0.1 mM phenylmethylsulfonyl fluoride (PMSF) and disrupted using a French pressure cell. The lysate was centrifuged at 12,500 rpm at 4°C for 20 min. The supernatant was loaded directly onto a Ni-nitrilotriacetic acid (Ni-NTA) column (Novagen) pre-equilibrated with lysis buffer. The column was then washed with wash buffer (50 mM NaH2PO4, pH 8.0; 300 mM NaCl; 20 mM imidazole), and the target protein was eluted with elution buffer (50 mM NaH2PO4, pH 8.0; 300 mM NaCl; 250 mM imidazole). The eluate was concentrated by ultrafiltration (Millipore) and then loaded onto a Superdex75 HR16/60 column (Amersham Pharmacia) pre-equilibrated with 50 mM NH4HCO3 at 20°C. Further purification was performed with a Mono Q HR5/5 column (Amersham Pharmacia).
Finally, CutCm was desalted into 50 mM Tris-HCl, pH 8.0, and concentrated for crystallization.

To produce the selenomethionine (SeMet)-substituted derivative, the method based on inhibition of the methionine biosynthesis pathway was used.11 The transformed BL21-Codon Plus(DE3)-RIL cells were grown in minimal medium at 37°C. When the OD600 of the culture reached 0.6–0.8, solid amino acid supplements (Lys, Phe, Thr, Ile, Leu, Val, SeMet) were added to the culture. After 15 min, expression was induced as for the native protein. The SeMet protein was purified using the same procedure as described above for the native protein. Additionally, the reducing reagent β-mercaptoethanol (10 mM) was added to all purification buffers except the final desalting buffer (50 mM Tris-HCl, pH 8.0).

The crystallization experiments were conducted using the hanging-drop vapor-diffusion method at 20°C, with 2 μL drops containing a 50:50 (v/v) mixture of reservoir solution and protein solution (10 mg/mL) equilibrated against 0.5 mL reservoir solution. The reservoir solution for native crystal growth was 20% polyethylene glycol (PEG8000), 0.1 M sodium cacodylate, pH 6.0, and 0.15 M calcium acetate. Diffraction-quality native crystals were obtained by successive microseeding and macroseeding. Diffraction-quality SeMet derivative crystals were obtained by a combined procedure of cross-seeding (with seeds taken from the native crystals), microseeding, and macroseeding,12 and the reservoir solution was 20% PEG8000, 0.1 M N-2-hydroxyethylpiperazine-N′-2-ethanesulfonic acid (HEPES)-Na, pH 7.0, and 0.1 M calcium acetate.

All diffraction data were collected at beamline 6A (BL6A) of the Photon Factory (Tsukuba, Japan) using an ADSC Quantum-4 charge-coupled device (CCD) detector. All crystals were mounted in nylon cryoloops (Hampton Research), briefly soaked in paraffin oil (Hampton Research), and then flash-cooled in a nitrogen-gas stream at 95 K.
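As a rough illustration of the hanging-drop setup described above, the following Python sketch computes the nominal starting concentrations in a drop made from equal volumes of protein solution and native reservoir solution. The function name and dictionary layout are illustrative assumptions, not part of the original protocol; only the volume ratio and the stock concentrations come from the text, and the contribution of the protein buffer (50 mM Tris-HCl) is ignored.

```python
def initial_drop_concentrations(protein_mg_ml, reservoir, protein_fraction=0.5):
    """Nominal concentrations right after mixing, before vapor diffusion.

    protein_fraction is the volume fraction of protein solution in the drop
    (0.5 for the 50:50 v/v mixture described in the text).
    """
    if not 0.0 < protein_fraction < 1.0:
        raise ValueError("protein_fraction must be between 0 and 1")
    reservoir_fraction = 1.0 - protein_fraction
    # The protein is diluted by the added reservoir solution ...
    drop = {"protein (mg/mL)": protein_mg_ml * protein_fraction}
    # ... and every reservoir component is diluted by the protein solution.
    for name, concentration in reservoir.items():
        drop[name] = concentration * reservoir_fraction
    return drop


# Native reservoir composition as given in the text.
native_reservoir = {
    "PEG8000 (%)": 20.0,
    "sodium cacodylate (M)": 0.1,
    "calcium acetate (M)": 0.15,
}

drop = initial_drop_concentrations(10.0, native_reservoir)
for component, value in drop.items():
    print(f"{component}: {value:g}")
```

During equilibration against the 0.5 mL reservoir, water leaves the drop and these starting values drift upward toward the reservoir concentrations; the sketch covers only the moment of mixing.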
The native data were collected at a wavelength of 0.9780 Å to 1.7 Å resolution. Three-wavelength MAD data extending to 2.1 Å resolution (peak: 0.97850 Å, inflection: 0.97938 Å, high-energy remote: 0.9700 Å) were collected from a SeMet derivative crystal. The peak and inflection point wavelengths were determined by recording an X-ray fluorescence spectrum. The native and SeMet derivative data were integrated using MOSFLM13 and scaled using SCALA.13 Data statistics are summarized in Table I.

The structure of CutCm was determined using the MAD method. Twenty-two possible selenium sites were found by SHELXD.16 After 2 unreasonable sites were removed, the best experimental phases for the remaining 20 selenium sites were calculated by SHARP.17 After phasing, density modification was carried out with DM.18 The initial model, comprising 85% of the CutCm structure, was built automatically using ARP/wARP.13, 19 Regions that were not built automatically were built manually using O.20 The model was refined initially using the Crystallography & NMR System (CNS) version 1.121 against the data recorded at the peak wavelength and restrained by the experimental phases. Once Rfree dropped below 30%, the model was refined against the 1.7 Å native data without phase restraints. The final model was completed by manual adjustment in O.20 The final R factor was 19.1% with an Rfree of 22.2%. The model quality was checked using PROCHECK15 (Table I).

The final native structure model of CutCm includes 2 protein molecules (residues 1–210 and 223–248 for chain A, residues 1–211 and 224–250 for chain B), 547 water molecules, and 1 Ca2+ ion. In addition, 1 disulfide bond is formed between Cys8 and Cys13 in each molecule.
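For readers who prefer photon energies, the MAD wavelengths reported above can be converted with E = hc/λ. The short Python sketch below does this conversion; the function and variable names are illustrative assumptions, and the resulting energies are derived values rather than numbers stated in the original report. All three should fall near the selenium K absorption edge, which lies at roughly 12.66 keV.

```python
# h*c expressed in keV·Å, so that E [keV] = HC_KEV_ANGSTROM / wavelength [Å].
HC_KEV_ANGSTROM = 12.398419843


def wavelength_to_energy_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in Å to a photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths (Å) of the three-wavelength MAD data set, as given in the text.
mad_wavelengths = {
    "peak": 0.97850,
    "inflection": 0.97938,
    "high-energy remote": 0.9700,
}

energies = {
    label: wavelength_to_energy_kev(wavelength)
    for label, wavelength in mad_wavelengths.items()
}

for label, energy in energies.items():
    print(f"{label}: {energy:.4f} keV")
```

As expected for a MAD experiment, the inflection point lies at the lowest energy, the peak slightly above it, and the remote wavelength well above the edge.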
Some residues are missing from the model owing to lack of electron density: the hexahistidine tag and 12 residues in each molecule (Asn A211 to Ala A222 in chain A, and Gln B212 to Ala B223 in chain B), in addition to 2 residues (Leu A249, Glu A250) in chain A. The 2 molecules in the asymmetric unit are related by a noncrystallographic 2-fold axis [Fig. 2(a)]. They form a dimer with a buried surface area of 3688 Å² per monomer and 60 intermolecular hydrogen bonds; the 2 molecules superpose with a root-mean-square deviation (RMSD) of 0.49 Å for their Cα atoms.

Fig. 2. (a) Ribbon diagram of the CutCm dimer. Ribbons are green and orange for molecules A and B, respectively. (b) Ribbon drawing of the structure of CutCm molecule A. α-helices and 3₁₀-helices are shown in orange and magenta, respectively; β1–β8 and β9–β10 outside the barrel are shown in yellow and cyan, respectively. (c) The 10 charged residues inside the TIM barrel that are presumed to be functionally important. The oxygen and nitrogen atoms are red and blue, respectively, and the hydrogen bonds are cyan. The figures were prepared with MOLSCRIPT.22

The CutCm monomer adopts a common (β/α)8 TIM barrel fold.10 It is composed of 10 β-strands (β1–β10) and 10 helices (α1–α10): the 8 barrel strands (β1–β8) are surrounded by 8 corresponding α-helices (α1, α3–α5, and α7–α10), while α2 and α6 are 3₁₀-helices [Figs. 1 and 2(b)]. Inside the barrel, there are 10 charged residues (Glu A5, Arg A24, Glu A26, His A55, Arg A59, His A121, Arg A122, Arg A144, Glu A194, and His A196). Six hydrogen bonds are formed among 6 of these 10 residues (A5 OE2…NE A24, 2.72 Å; A5 OE2…NH1 A24, 3.11 Å; A26 OE2…NE2 A196, 3.31 Å; A194 OE1…NH1 A24, 2.84 Å; A194 OE1…NE A144, 2.75 Å; A194 OE2…NH1 A144, 2.98 Å) [Fig. 2(c)].
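The hydrogen-bond network listed above is easier to inspect when tabulated. The following Python sketch encodes the six reported contacts and counts how many bonds each residue takes part in. The data layout, the function names, and the 3.5 Å distance cutoff are illustrative assumptions (the cutoff is a commonly used upper limit for such contacts, not a value taken from the original analysis); only the atom pairs and distances come from the text.

```python
from collections import Counter

# (oxygen-side residue, atom, nitrogen-side residue, atom, distance in Å),
# transcribed from the list of hydrogen bonds inside the barrel.
HBONDS = [
    ("A5", "OE2", "A24", "NE", 2.72),
    ("A5", "OE2", "A24", "NH1", 3.11),
    ("A26", "OE2", "A196", "NE2", 3.31),
    ("A194", "OE1", "A24", "NH1", 2.84),
    ("A194", "OE1", "A144", "NE", 2.75),
    ("A194", "OE2", "A144", "NH1", 2.98),
]


def bonds_per_residue(bonds):
    """Count the hydrogen bonds in which each residue participates."""
    counts = Counter()
    for oxygen_residue, _, nitrogen_residue, _, _ in bonds:
        counts[oxygen_residue] += 1
        counts[nitrogen_residue] += 1
    return counts


def within_cutoff(bonds, cutoff=3.5):
    """Keep only contacts no longer than the (assumed) distance cutoff."""
    return [bond for bond in bonds if bond[4] <= cutoff]


counts = bonds_per_residue(HBONDS)
mean_distance = sum(bond[4] for bond in HBONDS) / len(HBONDS)

for residue, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{residue}: {n} hydrogen bond(s)")
print(f"mean distance: {mean_distance:.2f} Å")
```

In this tabulation Arg A24 and Glu A194 each take part in three of the six bonds, which is consistent with the description of six bonds shared among six residues.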
At the N-terminal end of the barrel, all of the turns between the α-helices and the subsequent β-strands are composed of only 3 or 4 residues, except the loop between α8 and β7 (residues 167–172), which spans 6 residues. At the C-terminal end of the barrel, the long regions between the β-strands and the subsequent α-helices include the two 3₁₀-helices (α2, α6) [Fig. 2(b)]. Furthermore, the longer region between β8 and α10 is composed of 35 residues (A197–A231), which include 2 β-strands (β9, β10) and the 12 missing residues. β9 and β10 are nearly parallel to the TIM barrel axis and appear as a short handle on the TIM barrel.

So far, over 1800 TIM barrel domains have been found, which are classified into 29 homologous superfamilies and 139 sequence families (CATH version 2.5.1).23, 24 As expected, the combinatorial extension (CE) search25 revealed that the structures most similar to CutCm all belong to the TIM barrel family. The PSI-BLAST26 search indicated that there is no obvious sequence similarity between CutCm and any other TIM barrel structure in the Protein Data Bank (PDB). To detect more distant homologues and analogues of CutCm, structure comparisons between CutCm and the representatives of the 29 homologous superfamilies of the TIM barrel family were performed using the SSAP program.27 The results showed that the closest structural homologue was phosphoribosyl anthranilate isomerase from Thermotoga maritima28 (tPRAI; PDB ID: 1NSJ; SSAP score, 81.89%; overlap, 82%; RMSD, 3.05 Å for 194 equivalent Cα atoms; sequence identity, 14% for 205 aligned residues). The main differences between the 2 structures are located at the C-terminal end of the barrel, where CutCm is more extended and, in particular, projects 2 additional β-strands (β9, β10).
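The RMSD values quoted for structural comparisons such as the one above are root-mean-square distances between paired Cα atoms. The following Python sketch shows that calculation for coordinate sets that are assumed to be already superposed and already paired residue by residue. It does not reproduce the SSAP alignment or any superposition step, and the toy coordinates are invented for illustration, not taken from the CutCm or tPRAI structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) coordinates in Å, assumed to be superposed and paired."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: three "Cα atoms" spaced 3.8 Å apart, and a copy of them
# shifted by 1 Å along z, so every equivalent pair is exactly 1 Å apart.
toy_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

toy_rmsd = rmsd(toy_a, toy_b)
print(f"RMSD: {toy_rmsd:.2f} Å")
```

In a real comparison, the list of equivalent atoms comes from the structural alignment, which is why an RMSD is always reported together with the number of atoms it covers.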
According to the classification described by Nagano et al.29 and the CATH levels,24 CutCm and tPRAI belong to the same homologous superfamily of TIM barrels, namely the triose phosphate isomerase, flavin mononucleotide (FMN)-dependent oxidoreductases, phosphate-binding enzymes, and tryptophan biosynthesis enzymes homologous superfamily (CATH code: …). Within this homologous superfamily, there are 24 sequence families. When the sequences were compared using SSAP again, CutCm showed less than 20% sequence identity to each representative member of the 24 sequence families over at least 60% of the residues of the smaller aligned protein. According to the CATH criteria for S-level classification, CutCm should be assigned to a new sequence family within this homologous superfamily of TIM barrels.

The CutCm structure is the first structure of the CutC family. Although the specific function of CutCm is still not clear, the tertiary structure reported in this article provides a sound basis for in-depth study of its structure–function relationship. In common TIM barrel proteins, the loops at the C-terminal end of the barrel are very important for function.10 Mapping the conserved sequence motif region of the CutC family (Fig. 1) onto the crystal structure of CutCm shows that the loop between β3 and α4 at the C-terminal end of the barrel and 8 of the 10 charged residues inside the barrel (Glu A5, Arg A24, Glu A26, Arg A59, His A121, Arg A144, Glu A194, and His A196) are conserved. It is plausible to speculate that the loop between β3 and α4 may be part of the active site, and that the polar character of the barrel interior is necessary for the function of CutCm.

Coordinates and structure factors for the structure of CutCm have been deposited at the PDB, Research Collaboratory for Structural Bioinformatics (RCSB), with accession code 1X7I.

We thank Professor N. Sakabe for his help during data collection at the Photon Factory in Japan, and Professor Jiang F. for providing the in-house X-ray facility for preliminary X-ray analysis.