

FFF | Fragment-Guided Flexible Fitting for Cryo-EM
Authors:
Chen Weijie, Wang Xinyan, Wang Yuhang
Docker image: fff-notebook:v0.2.3
Node type: c16_m62_1 Nvidia T4 (upgradable)
Date: 2023-07-31
Copyright 2023 @ Authors
Quick Start: Click the button above to connect to a node (by default, the fff-notebook:v0.2.3 image is used); the notebook can be run after a short wait. If you encounter any problems, please contact bohrium@dp.tech.
Tip: Running this notebook requires paid computing resources that include a T4 graphics card.
Cryo-electron microscopy technology for structural analysis
Cryo-electron microscopy (cryo-EM) is an advanced imaging technology in biology. In recent years it has become an important tool for determining the structure of biomolecules, especially large biomolecular complexes and membrane proteins. In cryo-EM, the biological sample is frozen so that the biomolecules keep their natural state at low temperature. A high-energy electron beam is then passed through the sample, transmission images are collected, and a high-resolution three-dimensional structure of the biomolecules is obtained through computational image processing and three-dimensional reconstruction. Because cryo-EM avoids the laborious crystal preparation required by traditional X-ray crystallography, it has attracted widespread attention.
Although cryo-EM plays a major role in biomedical structural research, the last step, building the all-atom model, remains challenging. First, cryo-EM images have a low signal-to-noise ratio, a consequence of electron-beam radiation damage to the sample and of the conformational variety of the biomolecules. This limits the resolution and therefore the accuracy of the atomic model. Second, converting the density map reconstructed by cryo-EM into an all-atom model still relies on prior biological information, templates, and model optimization, and the accuracy and reliability of these inputs largely determine the quality of the final atomic model. When no template is available or the structure is novel, these methods can reach their limits.
Cryo-electron microscopy all-atom model structure construction
The ultimate goal of cryo-EM structure analysis is to determine the all-atom structure of the target macromolecule from the experimentally observed images. Many methods exist for building atomic models; they fall into two categories, manual modeling and automatic modeling, each with its own advantages and disadvantages.
Manual modeling: Working from the density map, researchers build the atomic model of the biomacromolecule by hand in a graphical interface (for example, with the software COOT). This approach offers a high degree of freedom: where the density map lacks clear detail, the researcher can draw on known chemical information and experience. However, manual modeling is time-consuming, and the results depend on the researcher's experience and judgment.
Automatic modeling: Automatic modeling software (such as Phenix, Rosetta, ARP/wARP, and MDFF) generates atomic models directly from density maps. This approach is efficient and consistent, and it reduces human bias. However, its accuracy may be limited when the density map resolution is low or the model is complex.
To address the drawbacks of existing solutions, we propose a new method, FFF ("Fragment-guided Flexible Fitting") [[5]](https://openaccess.thecvf.com/content/CVPR2023/html/Chen_FFF_Fragment-Guided_Flexible_Fitting_for_Building_Complete_Protein_Structures_CVPR_2023_paper.html), which builds more accurate and complete protein structures from cryo-EM experimental data. FFF achieves more reliable cryo-EM structure modeling by combining protein structure prediction and protein structure recognition with flexible fitting algorithms.

FFF tutorial
Next, we use a real case to demonstrate the effect of the FFF algorithm.
ASCT2: A multi-conformational protein
ASCT2 is a transporter protein. Its function requires several stable conformations, of which two are the main ones: one is open toward the inside of the cell (6RVX) [[1]](https://doi.org/10.1038/s41467-019-11363-x), and the other is open toward the outside of the cell (7BCQ) [[2]](https://doi.org/10.1073/pnas.2104093118). In this case, we will build the all-atom model of the second, outward-open conformation from the 7BCQ density map. Shown below is the published structure of this conformation [2].


AlphaFold2 predicted structure
Below is the structure predicted by AlphaFold2 [3]. It is inward-open and therefore differs substantially from our target structure.
TM score (AlphaFold)
TM-score = 0.5766 (d0= 7.54)


Traditional electron microscope structure construction method
MDFF (Molecular Dynamics Flexible Fitting) is a traditional method for cryo-EM structure construction [4]. Let's see how well it builds the structure in this case.
The results below show that MDFF fails to build a three-dimensional atomic model that agrees with the density map. The main reason is that MDFF easily gets trapped in a local optimum when the initial structure and the target structure differ greatly.
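To make the local-optimum problem concrete, the sketch below evaluates the density-derived potential in the form given by Trabuco et al. [4]: V = ξ·[1 − (Φ − Φ_thr)/(Φ_max − Φ_thr)] where Φ ≥ Φ_thr, and V = ξ elsewhere. The function name, the default scaling factor, and the sample density values are illustrative assumptions, not settings used in this notebook.

```python
def mdff_potential(phi, phi_thr, phi_max, xi=1.0):
    """Density-derived MDFF potential for a single density value.

    phi      density value at the atom position
    phi_thr  threshold below which density is treated as background
    phi_max  maximum density in the map
    xi       energy scaling factor (illustrative default)
    """
    if phi_max <= phi_thr:
        raise ValueError("phi_max must exceed phi_thr")
    if phi < phi_thr:
        # Flat plateau: constant energy, so no gradient and no force.
        return xi
    return xi * (1.0 - (phi - phi_thr) / (phi_max - phi_thr))


# Hypothetical density samples along a line through the map.
samples = [0.0, 0.1, 0.4, 0.7, 1.0]
energies = [mdff_potential(p, phi_thr=0.2, phi_max=1.0) for p in samples]
print(energies)
```

An atom that starts in a region below the threshold sits on the flat plateau and feels no force from the map, and an atom near any high-density region is pulled toward it whether or not it is the correct one. Both effects make the outcome sensitive to the starting structure.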


Use FFF algorithm
Finally, let's try to use FFF to automatically build the all-atom model structure of ASCT2.

Variable definitions
1. Density map recognition
We first need to convert the input density map into a standard density map (voxel size of 1 Å) so that its voxel size matches that of the density maps used for model training. We also need to generate a variance map.
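The resampling step can be pictured as follows: the grid is re-sampled so that neighbouring points are 1 Å apart, and the value at each new point is interpolated from the original voxels. This is a minimal pure-Python sketch using trilinear interpolation; the function names and the tiny example grid are hypothetical, file reading is omitted, and the notebook's own tooling may use a different interpolation scheme.

```python
import math


def trilinear(grid, x, y, z):
    """Interpolate grid[i][j][k] at fractional index (x, y, z).

    Each axis of the grid must hold at least two points.
    """
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    lower, frac = [], []
    for value, n in zip((x, y, z), dims):
        value = min(max(value, 0.0), n - 1.0)      # clamp to the grid
        base = min(int(math.floor(value)), n - 2)  # lower corner index
        lower.append(base)
        frac.append(value - base)
    (i, j, k), (fx, fy, fz) = lower, frac
    total = 0.0
    for di, wx in ((0, 1 - fx), (1, fx)):
        for dj, wy in ((0, 1 - fy), (1, fy)):
            for dk, wz in ((0, 1 - fz), (1, fz)):
                total += wx * wy * wz * grid[i + di][j + dj][k + dk]
    return total


def resample_to_unit_voxel(grid, apix, target_apix=1.0):
    """Resample a 3D grid with voxel size apix to voxel size target_apix."""
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    # Number of points needed to cover the same physical extent.
    new_dims = [int(round((n - 1) * apix / target_apix)) + 1 for n in dims]
    scale = target_apix / apix  # new index -> old fractional index
    return [[[trilinear(grid, a * scale, b * scale, c * scale)
              for c in range(new_dims[2])]
             for b in range(new_dims[1])]
            for a in range(new_dims[0])]


# Hypothetical 3 x 3 x 3 map sampled at 2 A per voxel.
coarse = [[[i + 2 * j + 3 * k for k in range(3)] for j in range(3)] for i in range(3)]
fine = resample_to_unit_voxel(coarse, apix=2.0)
print(len(fine), len(fine[0]), len(fine[0][0]))
```

The example map is a linear ramp, which trilinear interpolation reproduces exactly, so the resampled values are easy to check by hand.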
['7BCQ.apix1.ccp4', '7BCQ.apix1_apix_map.mrc', '7BCQ.apix1_res_3.0.dx', '7BCQ.apix1_std_map.mrc', '7BCQ_clean.pdb', '7BCQ_clean_chain.pdb', '7BCQ_clean_clean.pdb', '7BCQ_clean_no_hetero.pdb', '7BCQ_cmd.dcd', '7BCQ_cmd.pdb', '7BCQ_cmd.rst', '7BCQ_cmd.tmscore.txt', '7BCQ_cmd_cmd_config.yml', '7BCQ_infer.cif', '7BCQ_infer.pdb', '7BCQ_infer.txt', '7BCQ_infer_backbone.mrc', '7BCQ_restr.exb', '7BCQ_restr_config.yml', '7BCQ_tmd.pdb', '7BCQ_tmd.tmscore.txt', '7BCQ_tmd_raw.pdb', '7BCQ_tmd_tmd.dcd', '7BCQ_tmd_tmd.rst', '7BCQ_tmd_tmd_config.yml', '7bcq_fff.dcd', '7bcq_fff.pdb']
3D fragment structure prediction
We can now run recognition on the density map and generate a set of protein fragments. Fragment recognition draws on several kinds of information: the probability, position, and amino acid type of atoms, as well as the pseudo-peptide vectors.
/opt/conda/envs/dpemm/lib/python3.9/site-packages/torch/nn/functional.py:3704: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
(64, 64, 64) -> (64, 64, 64)
0 32
match domain from given fasta: /demo/fasta/7BCQ.fasta
num_residue: 236
num_domain: 18
domain mean length: 13.11111111111111
/data/fff_demo/output/7BCQ_infer.txt
/data/fff_demo/output/7BCQ_infer.cif
/data/fff_demo/output/7BCQ_infer.pdb
/data/fff_demo/output/7BCQ_infer_backbone.mrc
fff infer --output-txt /data/fff_demo/output/7BCQ_infer.txt --output-cif /data/fff_demo/output/7BCQ_infer.cif --output-pdb /data/fff_demo/output/7BCQ_infer.pdb --input-config /ckpt/train_config.json --input-weights /ckpt/fffw_304000.pt --input-raw-map /data/fff_demo/output/7BCQ.apix1_apix_map.mrc --input-std-map /data/fff_demo/output/7BCQ.apix1_std_map.mrc --input-fasta /demo/fasta/7BCQ.fasta --output-backbone-map /data/fff_demo/output/7BCQ_infer_backbone.mrc --confidence 0.3 --length-cutoff 2 --device 0
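The command above passes `--confidence 0.3` and `--length-cutoff 2`, and the log reports 236 residues grouped into 18 fragments. How `fff infer` applies these two options internally is not shown here, so the following sketch illustrates only one plausible reading: keep runs of consecutive residues whose confidence reaches the threshold, then discard runs shorter than the length cutoff. The function name and the example scores are hypothetical.

```python
def split_into_fragments(confidences, min_confidence=0.3, min_length=2):
    """Group consecutive confident residues into fragments.

    Returns a list of (start, end) index pairs with end exclusive.
    """
    fragments, start = [], None
    for index, value in enumerate(confidences):
        if value >= min_confidence:
            if start is None:
                start = index                 # open a new fragment
        elif start is not None:
            fragments.append((start, index))  # close the current fragment
            start = None
    if start is not None:
        fragments.append((start, len(confidences)))
    # Drop fragments that are shorter than the length cutoff.
    return [(a, b) for a, b in fragments if b - a >= min_length]


# Hypothetical per-residue confidences for a ten-residue stretch.
scores = [0.9, 0.8, 0.1, 0.7, 0.2, 0.6, 0.6, 0.5, 0.05, 0.4]
fragments = split_into_fragments(scores)
mean_length = sum(b - a for a, b in fragments) / len(fragments)
print(fragments, mean_length)
```

The "domain mean length" in the log follows the same arithmetic as `mean_length` here: 236 residues divided by 18 fragments gives about 13.11.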
Fragment prediction effect
Shown below is the predicted fragment structure.

Shown is the comparison between the backbone density map (gray) predicted by FFF and the input density map (light blue). The backbone density map gives the probability that each voxel belongs to a backbone atom (Cα, C, N).
Subsequent density-map fitting uses the input density map by default; users can also fit against the backbone density map instead.

2. Protein full atomic structure construction
With the predicted protein fragments in hand, we use them as structural constraints to build a complete protein structure. The flow chart of this part of the algorithm is shown below.

Initial protein structure and processing
The protein structure files we get usually have a lot of missing information (hydrogen atoms, side chains of certain residues, etc.). We first need to repair the input initial structure.
From the results below, it can be clearly seen that the results of FFF construction are very close to the target structure, which can basically meet the needs of cryo-EM structure construction.
Finding missing atoms...
Adding missing atoms...
Writing output... Done.
Load PDB... Done.
Re-organize chain id... Done.
Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.
Finding missing residues...
Finding nonstandard residues...
Replacing nonstandard residues...
Finding missing atoms...
Adding missing atoms...
Adding missing hydrogens...
Writing output... Done.
Find GB force
Minimization...
Before: 437100.15625 kJ/mol -20948833.0407383 kJ/(nm mol)
After: -54571.84375 kJ/mol -1248.0090580619872 kJ/(nm mol)
Done.
Generation of grid file & structural constraint file
Next, we need to prepare the grid file and the structural constraint file required for the molecular dynamics simulation.
dpems grid --input /data/fff_demo/output/7BCQ.apix1.ccp4 --output /data/fff_demo/output/7BCQ.apix1_res_3.0.dx --rinp 3 --rout 3.0
>> input is a ccp4 file: /data/fff_demo/output/7BCQ.apix1.ccp4
>> origin from /data/fff_demo/output/7BCQ.apix1.ccp4: [55.65999806 86.019997 72.86399746]
GRID SIZE: 65 x 65 x 65
>> output grid origin [55.65999806 86.019997 72.86399746]
dpems optstruc --configure /data/fff_demo/output/7BCQ_restr_config.yml
PLATFORM: CUDA
Writing SSrestraint
Writing CHIRALrestraint
Writing CISrestraint
restraint file: /data/fff_demo/output/7BCQ_restr.exb
Structure Fitting
The next step is to fit the initial structure to the predicted fragment structure. At the same time, the grid file converted from the density map guides the whole fitting process.
dpems tmd --init-pdb /data/fff_demo/output/7BCQ_clean.pdb --restraint /data/fff_demo/output/7BCQ_restr.exb --coupling-config /data/fff_demo/output/7BCQ_tmd_tmd_config.yml --output-restart /data/fff_demo/output/7BCQ_tmd_tmd.rst --output-dcd /data/fff_demo/output/7BCQ_tmd_tmd.dcd --output-pdb /data/fff_demo/output/7BCQ_tmd_raw.pdb --output-pdb-aligned /data/fff_demo/output/7BCQ_tmd.pdb --temperature 10 --nsteps 12000 --traj-freq 1000 --report-freq 1000 --tmd-update-freq 1000 --platform CUDA
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
>>> 236 atoms selected for TMD
>>> initial RMSD: 8.48 A
Stage 1: gamma = 0.92
#"Step","Potential Energy (kJ/mole)","Temperature (K)","Density (g/mL)","Speed (ns/day)","Time Remaining"
1000,-55395.15953086443,10.935697480964484,9.625561292924427,0,--
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
[stage 1] >>> save /data/fff_demo/output/7BCQ_tmd_raw.pdb
[stage 1] >>> save /data/fff_demo/output/7BCQ_tmd.pdb
[stage 1] >>> rmsd: 7.80 A (gamma = 0.9166666666666666)
Stage 2: gamma = 0.83
2000,-55288.67541526787,12.075340646822927,9.625561292924427,69.1,0:25
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
[stage 2] >>> save /data/fff_demo/output/7BCQ_tmd_raw.pdb
[stage 2] >>> save /data/fff_demo/output/7BCQ_tmd.pdb
[stage 2] >>> rmsd: 7.13 A (gamma = 0.8333333333333334)
Stage 3: gamma = 0.75
3000,-55083.09274333183,11.651242182053384,9.625561292924427,71.2,0:21
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
[stage 3] >>> save /data/fff_demo/output/7BCQ_tmd_raw.pdb
[stage 3] >>> save /data/fff_demo/output/7BCQ_tmd.pdb
[stage 3] >>> rmsd: 6.46 A (gamma = 0.75)
Stage 4: gamma = 0.67
4000,-54961.81337345173,12.08808337728884,9.625561292924427,72.6,0:19
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
[stage 4] >>> save /data/fff_demo/output/7BCQ_tmd_raw.pdb
[stage 4] >>> save /data/fff_demo/output/7BCQ_tmd.pdb
[stage 4] >>> rmsd: 5.77 A (gamma = 0.6666666666666667)
Stage 5: gamma = 0.58
5000,-54775.03003960011,12.387954444093356,9.625561292924427,72.7,0:16
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
[stage 5] >>> save /data/fff_demo/output/7BCQ_tmd_raw.pdb
[stage 5] >>> save /data/fff_demo/output/7BCQ_tmd.pdb
[stage 5] >>> rmsd: 5.05 A (gamma = 0.5833333333333333)
Stage 6: gamma = 0.50
6000,-54542.08228012238,11.673589617894702,9.625561292924427,73.3,0:14
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
[stage 6] >>> save /data/fff_demo/output/7BCQ_tmd_raw.pdb
[stage 6] >>> save /data/fff_demo/output/7BCQ_tmd.pdb
[stage 6] >>> rmsd: 4.34 A (gamma = 0.5)
Stage 7: gamma = 0.42
7000,-54390.491832383756,12.691338374008142,9.625561292924427,73.3,0:11
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
[stage 7] >>> save /data/fff_demo/output/7BCQ_tmd_raw.pdb
[stage 7] >>> save /data/fff_demo/output/7BCQ_tmd.pdb
[stage 7] >>> rmsd: 3.62 A (gamma = 0.41666666666666663)
Stage 8: gamma = 0.33
8000,-54243.49506762587,12.286542322856695,9.625561292924427,73.4,0:09
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
[stage 8] >>> save /data/fff_demo/output/7BCQ_tmd_raw.pdb
[stage 8] >>> save /data/fff_demo/output/7BCQ_tmd.pdb
[stage 8] >>> rmsd: 2.92 A (gamma = 0.33333333333333337)
Stage 9: gamma = 0.25
9000,-54145.08429337973,12.500200522709907,9.625561292924427,73.6,0:07
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
[stage 9] >>> save /data/fff_demo/output/7BCQ_tmd_raw.pdb
[stage 9] >>> save /data/fff_demo/output/7BCQ_tmd.pdb
[stage 9] >>> rmsd: 2.22 A (gamma = 0.25)
Stage 10: gamma = 0.17
10000,-53847.063096414015,12.377334237655072,9.625561292924427,73.9,0:04
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
[stage 10] >>> save /data/fff_demo/output/7BCQ_tmd_raw.pdb
[stage 10] >>> save /data/fff_demo/output/7BCQ_tmd.pdb
[stage 10] >>> rmsd: 1.62 A (gamma = 0.16666666666666663)
Stage 11: gamma = 0.08
11000,-53416.41181881514,12.860831477990114,9.625561292924427,73.9,0:02
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
[stage 11] >>> save /data/fff_demo/output/7BCQ_tmd_raw.pdb
[stage 11] >>> save /data/fff_demo/output/7BCQ_tmd.pdb
[stage 11] >>> rmsd: 1.01 A (gamma = 0.08333333333333337)
Stage 12: gamma = 0.00
12000,-52730.25720070057,14.061296279725285,9.625561292924427,73.9,0:00
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
[stage 12] >>> save /data/fff_demo/output/7BCQ_tmd_raw.pdb
[stage 12] >>> save /data/fff_demo/output/7BCQ_tmd.pdb
[stage 12] >>> rmsd: 0.57 A (gamma = 0.0)
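The TMD log shows gamma falling in equal steps from 0.92 to 0.00 over 12 stages, which matches `--nsteps 12000` with `--tmd-update-freq 1000`. The sketch below reproduces that schedule. The function name is hypothetical, and the relation gamma = 1 − stage / number of stages is inferred from the logged values rather than taken from the `dpems` source.

```python
def tmd_gamma_schedule(nsteps, update_freq):
    """Return one gamma per stage, falling linearly to 0 at the last stage."""
    if nsteps % update_freq != 0:
        raise ValueError("nsteps should be a multiple of update_freq")
    n_stages = nsteps // update_freq
    return [1 - stage / n_stages for stage in range(1, n_stages + 1)]


gammas = tmd_gamma_schedule(nsteps=12000, update_freq=1000)
initial_rmsd = 8.48  # angstrom, copied from the log

for stage, gamma in enumerate(gammas, start=1):
    # gamma * initial RMSD: the distance left if progress were exactly linear.
    linear_target = gamma * initial_rmsd
    print(f"stage {stage:2d}: gamma = {gamma:.2f}, linear target = {linear_target:.2f} A")
```

The logged RMSD values (7.80 Å after stage 1 down to 0.57 Å after stage 12) stay slightly above this linear target at every stage.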
{'cmd_k': 10000.0, 'cmd_selection': 'name CA', 'cmd_steps_per_stage': 500, 'cmd_total_stages': 2, 'gpu_device': 0, 'input_pdb_init': '/data/fff_demo/output/7BCQ_tmd.pdb', 'input_pdb_target': '/data/fff_demo/output/7BCQ_infer.pdb', 'input_restr': '/data/fff_demo/output/7BCQ_restr.exb', 'output_dcd': '/data/fff_demo/output/7BCQ_cmd.dcd', 'output_pdb': '/data/fff_demo/output/7BCQ_cmd.pdb', 'output_rst': None, 'platform': 'CUDA', 'report_freq': 500, 'temperature': 10.0, 'total_steps': 5000, 'traj_freq': 500}
dpems cmd --input-pdb /data/fff_demo/output/7BCQ_tmd.pdb --coupling-config /data/fff_demo/output/7BCQ_cmd_cmd_config.yml --output-restart /data/fff_demo/output/7BCQ_cmd.rst --output-dcd /data/fff_demo/output/7BCQ_cmd.dcd --output-pdb /data/fff_demo/output/7BCQ_cmd.pdb --temperature 10.0 --total-steps 5000 --traj-freq 500 --report-freq 500 --cmd-total-stages 2 --cmd-steps-per-stage 500 --platform CUDA --debug --restraint /data/fff_demo/output/7BCQ_restr.exb
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.07s.
CREATE BIAS USING MAP: /data/fff_demo/output/7BCQ.apix1_res_3.0.dx
INPUTMAP: 64 x 64 x 64
CREATEMAP: 64 x 64 x 64
>>> MDFF biases: [<openmm.openmm.CustomCompoundBondForce; proxy of <Swig Object of type 'OpenMM::CustomCompoundBondForce *' at 0x7f23b196ec90> >]
>>> All biases: [<openmm.openmm.CustomExternalForce; proxy of <Swig Object of type 'OpenMM::CustomExternalForce *' at 0x7f23b1945630> >, <openmm.openmm.CustomCompoundBondForce; proxy of <Swig Object of type 'OpenMM::CustomCompoundBondForce *' at 0x7f23b196ec90> >]
>>> Add restraints (SS, cis, chiral) using "/data/fff_demo/output/7BCQ_restr.exb"
CMD Stage 1: gamma: 0.500 (236 atoms restrained)
#"Step","Potential Energy (kJ/mole)","Temperature (K)","Density (g/mL)","Speed (ns/day)","Time Remaining"
500,-97090.2734375,11.759734960739115,9.625561292924427,0,--
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
>>> RMSD: 0.72 Å
>>> atom 82: xyz=[ 6.894629 11.00670433 8.39646816] nm; sys: xyz=[ 6.8741 11.0359 8.3503] nm; ref: xyz=[ 6.9458 10.9505 8.478 ] nm;
CMD Stage 2: gamma: 1.000 (236 atoms restrained)
1000,-95746.671875,13.16433019533872,9.625561292924427,70.8,0:09
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.06s.
>>> RMSD: 0.65 Å
>>> atom 82: xyz=[ 6.92476416 10.98426247 8.44348335] nm; sys: xyz=[ 6.8741 11.0359 8.3503] nm; ref: xyz=[ 6.9458 10.9505 8.478 ] nm;
>>> Run MD with constraints (4000 steps to go; 236 atoms restrained)
1500,-96080.484375,12.115721037961649,9.625561292924427,70.2,0:08
2000,-96188.5859375,10.847192694100114,9.625561292924427,97.9,0:05
2500,-96217.4140625,10.104542208772175,9.625561292924427,122,0:03
3000,-96229.640625,10.122088472905846,9.625561292924427,143,0:02
3500,-96230.3203125,10.066886989533359,9.625561292924427,162,0:01
4000,-96215.984375,9.754862200097236,9.625561292924427,176,0:00
4500,-96229.625,9.944912422566823,9.625561292924427,191,0:00
5000,-96230.578125,9.86418352907686,9.625561292924427,204,0:00
@> 236 atoms and 1 coordinate set(s) were parsed in 0.00s.
@> 6701 atoms and 1 coordinate set(s) were parsed in 0.07s.
>>> RMSD: 0.65 Å
>>> atom 82: xyz=[ 6.9252367 10.98064423 8.4386816 ] nm; sys: xyz=[ 6.8741 11.0359 8.3503] nm; ref: xyz=[ 6.9458 10.9505 8.478 ] nm;
Done!
>>> Total number of steps: 5000
>>> output pdb: /data/fff_demo/output/7BCQ_cmd.pdb
Time elapsed: 12.634897708892822 s
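For atom 82, the CMD log prints a current position (`xyz`), a `sys` position, and a `ref` position, while gamma rises from 0.5 to 1.0 over two stages. One way to read this, offered here purely as an assumption, is a harmonic restraint whose centre moves from the `sys` position toward the `ref` position as gamma increases. The sketch below illustrates that idea and measures how far atom 82 ends from its reference position. The function names are hypothetical, and neither the energy form nor the units of `cmd_k` are stated in the log.

```python
import math


def restraint_center(start, target, gamma):
    """Point a fraction gamma of the way from start to target."""
    return [s + gamma * (t - s) for s, t in zip(start, target)]


def harmonic_energy(position, center, k):
    """0.5 * k * |position - center|^2; units follow k and the coordinates."""
    return 0.5 * k * sum((p - c) ** 2 for p, c in zip(position, center))


# Atom 82 coordinates in nm, copied from the log.
sys_xyz = [6.8741, 11.0359, 8.3503]
ref_xyz = [6.9458, 10.9505, 8.478]
final_xyz = [6.9252367, 10.98064423, 8.4386816]

# Assumed restraint centres for the two logged gamma values.
centers = {gamma: restraint_center(sys_xyz, ref_xyz, gamma) for gamma in (0.5, 1.0)}
for gamma, center in centers.items():
    print(f"gamma = {gamma}: centre = {center}")

# Distance between the final logged position and the reference position.
distance_nm = math.dist(final_xyz, ref_xyz)
print(f"atom 82 ends {distance_nm * 10:.2f} A from its reference position")
```

The distance computed for this single atom is of the same order as the final RMSD of 0.65 Å reported over all 236 restrained atoms.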
3. Comparison of predicted and published structures
Finally, we compare how much the predicted structure differs from the published structure.
Intermediate TM score (after TMD)
TM-score = 0.8989 (d0= 7.54)
-----------------
Final TM score (after CMD)
TM-score = 0.9096 (d0= 7.54)
-----------------
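For readers unfamiliar with the metric, the TM-score sums 1 / (1 + (d_i / d0)^2) over paired residues and divides by the reference length L, with d0 = 1.24 (L − 15)^(1/3) − 1.8 Å. The sketch below evaluates that formula for residues that are already paired and superposed; the full TM-score additionally searches for the superposition that maximizes the sum, which is omitted here. The function names are hypothetical, and the length of 442 used in the example is inferred from the reported d0, not stated in this notebook.

```python
def tm_d0(length):
    """Length-dependent distance scale of the TM-score, in angstrom."""
    if length <= 15:
        raise ValueError("this d0 formula needs a length above 15")
    return 1.24 * (length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, length):
    """TM-score for already superposed, already paired residues.

    distances  per-residue CA-CA distances in angstrom
    length     reference protein length used for normalisation
    """
    d0 = tm_d0(length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / length


# A reference length of 442 reproduces the d0 printed in the output above.
print(round(tm_d0(442), 2))

# Hypothetical distances: a perfect match, then every residue off by exactly d0.
perfect = tm_score([0.0] * 100, length=100)
offset = tm_score([tm_d0(100)] * 100, length=100)
print(perfect, offset)
```

Because each residue contributes at most 1 / L, residues that are far from their partners or missing from the model lower the score without dominating it, which is why the metric is less sensitive to a few badly placed loops than RMSD.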
Final output file
Finally, we copy the intermediate file to the final output file.
>>> output pdb: /data/fff_demo/output/7bcq_fff.pdb
>>> output dcd: /data/fff_demo/output/7bcq_fff.dcd
Output structure display
Shown below is the structure predicted by FFF (black) overlaid on the input density map and compared with the published structure (red). The quality of the built structure depends on the local quality of the density map. In regions with weak density features (such as loops), the prediction and construction process is less tightly constrained, so the deviation from the published structure is larger. In regions with strong density features, the automatically built structure largely agrees with the published structure.


Summary
Although many methods exist for building all-atom models from cryo-EM data, accurate and automatic model building for medium-resolution density maps remains a challenge. FFF automates protein structure construction by combining three-dimensional recognition algorithms from computer vision with molecular dynamics simulation. Its accuracy exceeds that of traditional methods and of protein structure prediction methods: in the ASCT2 case above, the final model reaches a TM-score of 0.9096 against the published structure, compared with 0.5766 for the AlphaFold2 prediction. In the future, FFF will be extended to structure construction for DNA, RNA, and small molecules. In addition, we have developed an app (https://app.bohrium.dp.tech/fff) based on the FFF algorithm so that more people can apply FFF in their cryo-EM data processing workflows.
References
[1] Garaeva, A.A., Guskov, A., Slotboom, D.J. et al. A one-gate elevator mechanism for the human neutral amino acid transporter ASCT2. Nat Commun 10, 3427 (2019)
[2] Garibsingh, R.A., Ndaru, E., Garaeva, A.A., Shi, Y., Zielewicz, L., Zakrepine, P., Bonomi, M., Slotboom, D.J., Paulino, C., Grewer, C., Schlessinger, A. Rational design of ASCT2 inhibitors using an integrated experimental-computational approach. Proc. Natl. Acad. Sci. U.S.A. 118, e2104093118 (2021)
[3] Jumper, J., Evans, R., Pritzel, A. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021)
[4] Trabuco, L.G., Villa, E., Mitra, K., Frank, J., Schulten, K. Flexible fitting of atomic structures into electron microscopy maps using molecular dynamics. Structure 16, 673–683 (2008)
[5] Chen, W., Wang, X., Wang, Y. FFF: Fragment-Guided Flexible Fitting for Building Complete Protein Structures. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 19776–19785 (2023)



