©️ Copyright 2023 @ Authors
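Authors:
Shuwen Yang (杨舒文)
Ziyao Li (李子尧)
Date: 2023-07-13
License: This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Quick start: Click the Start Connection (开始连接) button above, then select the unifold-notebook:v2 image and any machine configuration to get started.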
Uni-Fold Notebook
This notebook provides protein structure prediction with Uni-Fold and UF-Symmetry. Both protein monomers and multimers are supported. Homology search in this notebook uses the MMseqs2 server provided by ColabFold. For results more consistent with the original AlphaFold(-Multimer), please refer to the open-source repository of Uni-Fold, or our convenient web server at Hermite™.
Please note that this notebook is provided as an early-access prototype, and is NOT an official product of DP Technology. It is provided for theoretical modeling only and caution should be exercised in its use.
Licenses
This Colab uses the Uni-Fold model parameters and its outputs are under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license. You can find details at: https://creativecommons.org/licenses/by/4.0/legalcode. The Colab itself is provided under the Apache 2.0 license.
Citations
Please cite the following papers if you use this notebook:
- Ziyao Li, Xuyang Liu, Weijie Chen, Fan Shen, Hangrui Bi, Guolin Ke, Linfeng Zhang. "Uni-Fold: An Open-Source Platform for Developing Protein Folding Models beyond AlphaFold." bioRxiv (2022)
- Ziyao Li, Shuwen Yang, Xuyang Liu, Weijie Chen, Han Wen, Fan Shen, Guolin Ke, Linfeng Zhang. "Uni-Fold Symmetry: Harnessing Symmetry in Folding Large Protein Complexes." bioRxiv (2022)
- Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S and Steinegger M. "ColabFold: Making protein folding accessible to all." Nature Methods (2022)
Acknowledgements
The model architecture of Uni-Fold is largely based on AlphaFold and AlphaFold-Multimer. The design of this notebook follows ColabFold closely. We especially thank @sokrypton for his helpful suggestions on this notebook.
Copyright © 2022 DP Technology. All rights reserved.
1. Configuration
1.1 Input and Output
Set up the input contents (from a file, or by filling in input_json directly) and the output path.
- jobname (str): name of the job, used as the prefix of output directories.
- input_json_path (str): path of the input JSON file, which contains a list or dict of proteins. If it is a list, the indices are used as IDs. Each protein is a dict with keys:
  - symmetry: the protein's symmetry group. Use "C1" as default.
  - sequence: the sequences of the asymmetric unit (separated by ";").
- output_dir_base (str): root directory of output files.
For multimers, we recommend specifying a cyclic symmetry group (e.g. C4) together with the sequences of the asymmetric unit only (i.e. do not copy them multiple times), so that the prediction is made with UF-Symmetry.
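As an illustration of the input format described above, the following sketch writes an input JSON holding one monomer and one C2 complex, then reads it back. The sequences and the file location are placeholders invented for this example, not values from the notebook; only the keys symmetry and sequence and the ";" separator come from the description.

```python
import json
import os
import tempfile

# Hypothetical proteins; the sequences are short placeholders.
proteins = [
    # Monomer: default symmetry group and a single sequence.
    {"symmetry": "C1", "sequence": "MKTAYIAKQRQISFVKSHFSRQ"},
    # C2 complex whose asymmetric unit has two chains, separated by ";".
    # The asymmetric unit is given once, not copied per symmetric copy.
    {"symmetry": "C2", "sequence": "GSHMLEDPVK;ACDEFGHIKLMNPQ"},
]

# Placeholder location for the input file.
input_json_path = os.path.join(tempfile.mkdtemp(), "input.json")
with open(input_json_path, "w") as f:
    json.dump(proteins, f, indent=2)

# Read the file back. The top level is a list, so indices serve as IDs.
with open(input_json_path) as f:
    loaded = json.load(f)

for protein_id, protein in enumerate(loaded):
    chains = protein["sequence"].split(";")
    print(protein_id, protein["symmetry"], len(chains), "chain(s) in the asymmetric unit")
```

A dict at the top level would work the same way, with its keys taking the place of the list indices as protein IDs.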
1.2 Hyper-parameters
Set up inference parameters:
- use_templates (bool): whether to use template features.
- msa_mode (str): set to "MMseqs2" if MSA features are required, "single_sequence" if not.
- max_recycling_iters (int): maximum number of recycling iterations.
- num_ensembles (int): number of ensembles.
- manual_seed (int): random seed.
- times (int): number of Uni-Fold inference attempts.
- max_display_cnt: maximum number of proteins displayed in the visualization stage.
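The parameters above could be collected and sanity-checked as in the sketch below. The concrete values are illustrative assumptions, not the notebook's defaults, and check_params is a hypothetical helper. The numeric bounds it enforces are also assumptions; only the two msa_mode options are taken from the description.

```python
# Illustrative values only; the notebook's actual defaults may differ.
params = {
    "use_templates": True,
    "msa_mode": "MMseqs2",
    "max_recycling_iters": 3,
    "num_ensembles": 2,
    "manual_seed": 42,
    "times": 1,
    "max_display_cnt": 4,
}

# The two options named in the parameter description.
VALID_MSA_MODES = {"MMseqs2", "single_sequence"}


def check_params(p):
    """Hypothetical validator: raise ValueError on an implausible setting."""
    if p["msa_mode"] not in VALID_MSA_MODES:
        raise ValueError(f"msa_mode must be one of {sorted(VALID_MSA_MODES)}")
    # Assumed bound: zero recycling iterations is allowed, negatives are not.
    if p["max_recycling_iters"] < 0:
        raise ValueError("max_recycling_iters must be non-negative")
    # Assumed bound: these counts need at least one unit to do anything.
    for name in ("num_ensembles", "times", "max_display_cnt"):
        if p[name] < 1:
            raise ValueError(f"{name} must be at least 1")


check_params(params)  # passes silently for the values above
```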
1.3 Data-processing Functions
We recommend folding the following block, if possible.
2. Inference
2.1 Data Input
Input protein sequence(s).
Number of input proteins: 4
Using the single-chain model.
Using UF-Symmetry with group C2. If you do not want to use UF-Symmetry, please use `C1` and copy the AU sequences to the count in the assembly.
Using UF-Symmetry with group C2. If you do not want to use UF-Symmetry, please use `C1` and copy the AU sequences to the count in the assembly.
Using UF-Symmetry with group C3. If you do not want to use UF-Symmetry, please use `C1` and copy the AU sequences to the count in the assembly.
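The message above suggests switching to `C1` and copying the asymmetric unit (AU) sequences when UF-Symmetry is not wanted. A minimal sketch of that conversion follows; expand_to_c1 is a hypothetical helper, not part of the notebook, and the order in which the copies are concatenated is an assumption.

```python
import re


def expand_to_c1(symmetry, sequence):
    """Turn a cyclic-symmetry input (Cn plus AU sequences) into a C1 input
    that lists every chain of the assembly explicitly."""
    match = re.fullmatch(r"C(\d+)", symmetry)
    if match is None:
        raise ValueError(f"expected a cyclic group such as 'C2', got {symmetry!r}")
    n_copies = int(match.group(1))
    if n_copies < 1:
        raise ValueError("the order of the cyclic group must be at least 1")
    au_chains = sequence.split(";")
    # Repeat the whole AU n times: A;B under C2 becomes A;B;A;B under C1.
    return {"symmetry": "C1", "sequence": ";".join(au_chains * n_copies)}


# Placeholder sequences, for illustration only.
print(expand_to_c1("C3", "ACDEFGHIK"))
print(expand_to_c1("C2", "ACDEFG;LMNPQR"))
```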
2.2 Feature Generation
Process features for Uni-Fold prediction.
COMPLETE: 100%|██████████| 150/150 [elapsed: 00:06 remaining: 00:00] WARNING:absl:The exact sequence DPTQFEERHLKFLQQLGKGNFGSVEMCRYDPLQDGEVVAVKKLQHSTEEHLRDFEREIEILKSLQHDNIVKYKGVCYSAGRRNLKLIMEYLPYGSLRDYLQKHKERIDHIKLLQYTSQICKGMEYLGTKRYIHRDLATRNILVENENRVKIGDFGLTKVLPQDKEYYKVKEPGESPIFWYAPESLTESKFSVASDVWSFGVVLYELFTYIEKSKSPPAEFMRMIGNDKQGQMIVFHLIELLKNNGRLPRPDGCPDEIYMIMTECWNNNVNQRPSFRDLALRVDQIRDNMAG was not found in 7ll5_A. Realigning the template to the actual sequence. WARNING:absl:The exact sequence GDPTQFEERHLKFLQQLGKGNFGSVEMCRYDPLQDNTGEVVAVKKLQHSTEEHLRDFEREIEILKSLQHDNIVKYKGVCYSGRRNLKLIMEYLPYGSLRDYLQKHKERIDHIKLLQYTSQICKGMEYLGTKRYIHRDLATRNILVENENRVKIGDFGLTKVLPQDKEYYKVKEPGESPIFWYAPESLTESKFSVASDVWSFGVVLYELFTYIEKSKSPPAEFMRMIGNDKQGQMIVFHLIELLKNNGRLPRPDGCPDEIYMIMTECWNNNVNQRPSFRDLALRVDQIRDNMAG was not found in 6vnk_A. Realigning the template to the actual sequence. WARNING:absl:The exact sequence PTQFEERHLKFLQQLGKGNFGSVEMCRYDPLQDNTGEVVAVKKLQHSTEEHLRDFEREIEILKSLQHDNIVKYKGVCYSAGRNLKLIMEYLPYGSLRDYLQKHKERIDHIKLLQYTSQICKGMEYLGTKRYIHRDLATRNILVENENRVKIGDFGLTKVLPQDKEYYKVKEPGESPIFWYAPESLTESKFSVASDVWSFGVVLYELFTYIEKSKSPPAEFMRMIGNDKQGQMIVFHLIELLKNNGRLPRPDGCPDEIYMIMTECWNNNVNQRPSFRDLALRVDQIRDNMAG was not found in 5cf4_B. Realigning the template to the actual sequence. WARNING:absl:The exact sequence DPTQFEERHLKFLQQLGKGFGSVEMCRYDPLQDNTGEVVAVKKLQHSTEEHLRDFEREIEILKSLQHDNIVKYKGVCYSAGRRNLKLIMEYLPYGSLRDYLQKHKERIDHIKLLQYTSQICKGMEYLGTKRYIHRDLATRNILVENENRVKIGDFGLTKVLPQDKEYYKVKEPGESPIFWYAPESLTESKFSVASDVWSFGVVLYELFTYIEKSKSPPAEFMRMIGNDKQGQMIVFHLIELLKNNGRLPRPDGCPDEIYMIMTECWNNNVNQRPSFRDLALRVDQIRDN was not found in 4d0w_A. Realigning the template to the actual sequence. WARNING:absl:The exact sequence FEERHLKFLQQLGKGNFGSVEMCRYDPLQDNTGEVVAVKKLQHSTEEHLRDFEREIEILKSLQHDNIVKYKGVCYNLKLIMEYLPYGSLRDYLQKHKERIDHIKLLQYTSQICKGMEYLGTKRYIHRDLATRNILVENENRVKIGDFGLTKVLPQDKEYYKVKEPGESPIFWYAPESLTESKFSVASDVWSFGVVLYELFTYIEKSKSPPAEFMRMIGNDKQGQMIVFHLIELLKNNGRLPRPDGCPDEIYMIMTECWNNNVNQRPSFRDLALRVDQIRDNMAG was not found in 5cf8_A. 
Realigning the template to the actual sequence. WARNING:absl:The exact sequence PTQFEERHLKFLRQLGKGNFGSVEMCRYDPLQDNTGEVVAVKKLQHSTEEHLRDFEREIEILKSLQHDNIVKYKGVCYNLKLIMEFLPYGSLREYLQKHKERIDHIKLLQYTSQICKGMEYLGTKRYIHRDLATRNILVENENRVKIGDFGLTKVLPQDKEYYKVKEPGESPIFWYAPESLTESKFSVASDVWSFGVVLYELFTYIEKSKSPPAEFMRMIGNDKQGQMIVFHLIELLKNNGRLPRPDGCPDEIYMIMTECWNNNVNQRPSFRDLALRVDQIRDNMAG was not found in 4e6d_B. Realigning the template to the actual sequence. WARNING:absl:The exact sequence EERHLKFLQQLGKGNFGSVEMCRYDPLQDNTGEVVAVKKLQHSTEEHLRDFEREIEILKSLQHDNIVKYKGVCYSNLKLIMEYLPYGSLRDYLQKHKERIDHIKLLQYTSQICKGMEYLGTKRYIHRDLATRNILVENENRVKIGDFGLTKVLPQDKEYYKVKEPGESPIFWYAPESLTESKFSVASDVWSFGVVLYELFTYIEKSKSPPAEFMRMIGNDKQGQMIVFHLIELLKNNGRLPRPDGCPDEIYMIMTECWNNNVNQRPSFRDLALRVDQIRDNMAG was not found in 3tjd_A. Realigning the template to the actual sequence. WARNING:absl:The exact sequence FEDRDPTQFEERHLKFLQQLGKGSVEMCRYDPLQDNTGEVVAVKKLQHSTEEHLRDFEREIEILKSLQHDNIVKYKGVCYSAGRRNLKLIMEYLPYGSLRDYLQKHKERIDHIKLLQYTSQICKGMEYLGTKRYIHRDLATRNILVENENRVKIGDFGLTKVLPQDKEYYKVKEPGESPIFWYAPESLTESKFSVASDVWSFGVVLYELFTYIEKSKSPPAEFMRMIGNDKQGQMIVFHLIELLKNNGRLPRPDGCPDEIYMIMTECWNNNVNQRPSFRDLALRVDQIRDN was not found in 5wev_A. Realigning the template to the actual sequence. WARNING:absl:The exact sequence DPTQFEERHLKFLQQLGKGNFGSVEMCRYDPLQDNTGEVVAVKKLQHSTEEHLRDFEREIEILKSLQHDNIVKYKGVCYSAGRRNLKLIMEYLPYGSLRDYLQKHKERIDHIKLLQYTSQICKGMEYLGTKRYIHRDLATRNILVENENRVKIGDFGLTKVLPQDKEYYKVKESPIFWYAPESLTESKFSVASDVWSFGVVLYELFTYIEKSKSPPAEFMRMIGNDKQGQMIVFHLIELLKNNGRLPRPDGCPDEIYMIMTECWNNNVNQRPSFRDLALRVDQIRDNM was not found in 5usy_B. Realigning the template to the actual sequence. WARNING:absl:The exact sequence EERHLKFLQQLGKGNFGSVEMCRYDPLQDNTGEVVAVKKLQHSTEEHLRDFEREIEILKSLQHDNIVKYKGVCYSNLKLIMEYLPYGSLRDYLQKHKERIDHIKLLQYTSQICKGMEYLGTKRYIHRDLATRNILVENENRVKIGDFGLTKVLPQDKEYYKVKEPGESPIFWYAPESLTESKFSVASDVWSFGVVLYELFTYIEKSKSPPAEFMRMIGNDKQGQMIVFHLIELLKNNGRLPRPDGCPDEIYMIMTECWNNNVNQRPSFRDLALRVDQIRDQMAG was not found in 2b7a_A. 
Realigning the template to the actual sequence. WARNING:absl:The exact sequence QFEERHLKFLQQLGKGNFGSVEMCRYDPLQDNTGEVVAVKKLQHSTEEHLRDFEREIEILKSLQHDNIVKYKGVCYSAGRRNLKLIMEYLPYGSLRDYLQKHKERIDHIKLLQYTSQICKGMEYLGTKRYIHRDLATRNILVENENRVKIGDFGLTKVLPQDKEYYKVSPIFWYAPESLTESKFSVASDVWSFGVVLYELFTYIEKSKSPPAEFMRMIGNDKQGQMIVFHLIELLKNNGRLPRPDGCPDEIYMIMTECWNNNVNQRPSFRDLALRVDQIRDNMAG was not found in 3rvg_A. Realigning the template to the actual sequence. COMPLETE: 100%|██████████| 150/150 [elapsed: 00:07 remaining: 00:00] WARNING:absl:The exact sequence KEISVIGVPMDLGQMRRGVDMGPSAIRYAGVIERIEEIGYDVKDMGDICIENTKLRNLTQVATVCNELASKVDHIIEEGRFPLVLGGDHSIAIGTLAGVAKHYKNLGVIWYDAHGDLNTEETSPSGNIHGMSLAASLGYGHSSLVDLYGAYPKVKKENVVIIGARALDEGEKDFIRNEGIKVFSMHEIDRMGMTAVMEETIAYLSHTDGVHLSLDLDGLDPHDAPGVGTPVIGGLSYRESHLAMEMLAEADIITSAEFVEVNTILDERNRTATTAVALMGSLFGE was not found in 6nbk_D. Realigning the template to the actual sequence. WARNING:absl:The exact sequence KEISVIGVPMDLGQMRRGVDMGPSAIRYAGVIERIEEIGYDVKDMGDICINTKLRNLTQVATVCNELASKVDHIIEEGRFPLVLGGDHSIAIGTLAGVAKHYKNLGVIWYDAHGDLNTEETSPSGNIHGMSLAASLGYGHSSLVDLYGAYPKVKKENVVIIGARALDEGEKDFIRNEGIKVFSMHEIDRMGMTAVMEETIAYLSHTDGVHLSLDLDGLDPHDAPGVGTPVIGGLSYRESHLAMEMLAEADIITSAEFVEVNTILDERNRTATTAVALMGSLFGE was not found in 6nbk_C. Realigning the template to the actual sequence. WARNING:absl:The exact sequence KEISVIGVPMDLGQMRRGVDMGPSAIRYAGVIERIEEIGYDVKDMGDICIEENTKLRNLTQVATVCNELASKVDHIIEEGRFPLVLGGDHSIAIGTLAGVAKHYKNLGVIWYDAHGDLNTEETSPSGNIHGMSLAASLGYGHSSLVDLYGAYPKVKKENVVIIGARALDEGEKDFIRNEGIKVFSMHEIDRMGMTAVMEETIAYLSHTDGVHLSLDLDGLDPHDAPGVGTPVIGGLSYRESHLAMEMLAEADIITSAEFVEVNTILDERNRTATTAVALMGSLFGE was not found in 6nbk_A. Realigning the template to the actual sequence. 
WARNING:absl:The exact sequence DKTISVIGMPMDLGQARRGVDMGPSAIRYAHLIERLSDMGYTVEDLGDIPINELKNLNSVLAGNEKLAQKVNKVIEEKKFPLVLGGDHSIAIGTLAGTAKHYDNLGVIWYDAHGDLNTLETSPSGNIHGMPLAVSLGIGHESLVNLEGYAPKIKPENVVIIGARSLDEGERKYIKESGMKVYTMHEIDRLGMTKVIEETLDYLSACDGVHLSLDLDGLDPNDAPGVGTPVVGGISYRESHLAMEMLYDAGIITSAEFVEVNPILDHKNKTGKTAVELVESLLGK was not found in 6nfp_E. Realigning the template to the actual sequence. WARNING:absl:The exact sequence DKTISVIGMPMDLGQARRGVDMGPSAIRYAHLIERLSDMGYTVEDLGDIPINREELKNLNSVLAGNEKLAQKVNKVIEEKKFPLVLGGDHSIAIGTLAGTAKHYDNLGVIWYDAHGDLNTLETSPSGNIHGMPLAVSLGIGHESLVNLEGYAPKIKPENVVIIGARSLDEGERKYIKESGMKVYTMHEIDRLGMTKVIEETLDYLSACDGVHLSLDLDGLDPNDAPGVGTPVVGGISYRESHLAMEMLYDAGIITSAEFVEVNPILDHKNKTGKTAVELVESLLGK was not found in 6nfp_F. Realigning the template to the actual sequence. WARNING:absl:The exact sequence KTISVIGMPMDLGQARRGVDMGPSAIRYAHLIERLSDMGYTVEDLGDIPINREDEELKNLNSVLAGNEKLAQKVNKVIEEKKFPLVLGGDHSIAIGTLAGTAKHYDNLGVIWYDAHGDLNTLETSPSGNIHGMPLAVSLGIGHESLVNLEGYAPKIKPENVVIIGARSLDEGERKYIKESGMKVYTMHEIDRLGMTKVIEETLDYLSACDGVHLSLDLDGLDPNDAPGVGTPVVGGISYRESHLAMEMLYDAGIITSAEFVEVNPILDHKNKTGKTAVELVESLLGK was not found in 6nfp_C. Realigning the template to the actual sequence. WARNING:absl:The exact sequence KTISVIGMPMDLGQARRGVDMGPSAIRYAHLIERLSDMGYTVEDLGDIPINREKIDEELKNLNSVLAGNEKLAQKVNKVIEEKKFPLVLGGDHSIAIGTLAGTAKHYDNLGVIWYDAHGDLNTLETSPSGNIHGMPLAVSLGIGHESLVNLEGYAPKIKPENVVIIGARSLDEGERKYIKESGMKVYTMHEIDRLGMTKVIEETLDYLSACDGVHLSLDLDGLDPNDAPGVGTPVVGGISYRESHLAMEMLYDAGIITSAEFVEVNPILDHKNKTGKTAVELVESLLGKK was not found in 6nfp_A. Realigning the template to the actual sequence. WARNING:absl:The exact sequence KTISVIGMPMDLGQARRGVDMGPSAIRYAHLIERLSDMGYTVEDLGDIPINNLNSVLAGNEKLAQKVNKVIEEKKFPLVLGGDHSIAIGTLAGTAKHYDNLGVIWYDAHGDLNTLETSPSGNIHGMPLAVSLGIGHESLVNLEGYAPKIKPENVVIIGARSLDEGERKYIKESGMKVYTMHEIDRLGMTKVIEETLDYLSACDGVHLSLDLDGLDPNDAPGVGTPVVGGISYRESHLAMEMLYDAGIITSAEFVEVNPILDHKNKTGKTAVELVESLLGK was not found in 6dkt_D. Realigning the template to the actual sequence. 
WARNING:absl:The exact sequence KTISVIGMPMDLGQARRGVDMGPSAIRYAHLIERLSDMGYTVEDLGDIPINNLNSVLAGNEKLAQKVNKVIEEKKFPLVLGGDHSIAIGTLAGTAKHYDNLGVIWYDAHGDLNTLESGNIHGMPLAVSLGIGHESLVNLEGYAPKIKPENVVIIGARSLDEGERKYIKESGMKVYTMHEIDRLGMTKVIEETLDYLSACDGVHLSLDLDGLDPNDAPGVGTPVVGGISYRESHLAMEMLYDAGIITSAEFVEVNPILDHKNKTGKTAVELVESLLGK was not found in 6dkt_F. Realigning the template to the actual sequence. WARNING:absl:The exact sequence RVAVVGVPMDLGANRRGVDMGPSALRYARLLEQLEDLGYTVEDLGDVPVSLARLAYLEEIRAAALVLKERLAALPEGVFPIVLGGDHSLSMGSVAGAARGRRVGVVWVDAHADFNTPETSPSGNVHGMPLAVLSGLGHPRLTEVFRAVDPKDVVLVGVRSLDPGEKRLLKEAGVRVYTMHEVDRLGVARIAEEVLKHLQGLPLHVSLDADVLDPTLAPGVGTPVPGGLTYREAHLLMEILAESGRVQSLDLVEVNPILDERNRTAEMLVGLALSLLGKR was not found in 2ef4_A. Realigning the template to the actual sequence. WARNING:absl:The exact sequence RVAVVGVPMDLGVDMGPSALRYARLLEQLEDLGYTVEDLGDVPVSLAYLEEIRAAALVLKERLAALPEGVFPIVLGGDHSLSMGSVAGAARGRRVGVVWVDAHADFNTPETSSGNVHGMPLAVLSGLGHPRLTEVFRAVDPKDVVLVGVRSLDPGEKRLLKEAGVRVYTMHEVDRLGVARIAEEVLKHLQGLPLHVSLDADVLDPTLAPGVGTPVPGGLTYREAHLLMEILAESGRVQSLDLVEVNPILDERNRTAEMLVGLALSLLGKR was not found in 2eiv_M. Realigning the template to the actual sequence. COMPLETE: 100%|██████████| 300/300 [elapsed: 00:05 remaining: 00:00] COMPLETE: 100%|██████████| 300/300 [elapsed: 00:05 remaining: 00:00] COMPLETE: 100%|██████████| 150/150 [elapsed: 00:06 remaining: 00:00] WARNING:absl:The exact sequence CLAEGTRIFDPVTGTTHRIEDVVDGRKPIHVVAAAKDGTLHARPVVSWFDQGTRDVIGLRIAGGAILWATPDHKVLTEYGWRAAGELRKGDRVAQPRRFDGFMLAEELRYSVIREVLPTRRARTFDLEVEELHTLVAEGVVVH was not found in 2imz_A. Realigning the template to the actual sequence. WARNING:absl:The exact sequence CLAEGTRIFDPVTGTTHRIEDVVDGRKPIHVVAAAKDGTLHARPVVSWFDQGTRDVIGLRIAGGAILWATPDHKVLTEYGWRAAGELRKGDRVAQPRRFDGFEELRYSVIREVLPTRRARTFDLEVEELHTLVAEGVVVH was not found in 2imz_B. Realigning the template to the actual sequence.
2.3 Model Prediction
Uni-Fold prediction
start to load params /root/params/monomer.unifold.pt start to predict unifold_bohrium_0 {'aatype': torch.Size([1, 1, 317]), 'residue_index': torch.Size([1, 1, 317]), 'seq_length': torch.Size([1, 1]), 'msa_chains': torch.Size([4, 1, 508, 1]), 'template_aatype': torch.Size([1, 1, 4, 317]), 'template_all_atom_mask': torch.Size([1, 1, 4, 317, 37]), 'template_all_atom_positions': torch.Size([1, 1, 4, 317, 37, 3]), 'bert_mask': torch.Size([4, 1, 508, 317]), 'msa_mask': torch.Size([4, 1, 508, 317]), 'num_recycling_iters': torch.Size([1, 1]), 'is_distillation': torch.Size([4, 1]), 'seq_mask': torch.Size([1, 1, 317]), 'msa_row_mask': torch.Size([4, 1, 508]), 'template_mask': torch.Size([1, 1, 4]), 'template_pseudo_beta': torch.Size([1, 1, 4, 317, 3]), 'template_pseudo_beta_mask': torch.Size([1, 1, 4, 317]), 'template_torsion_angles_sin_cos': torch.Size([1, 1, 4, 317, 7, 2]), 'template_alt_torsion_angles_sin_cos': torch.Size([1, 1, 4, 317, 7, 2]), 'template_torsion_angles_mask': torch.Size([1, 1, 4, 317, 7]), 'residx_atom14_to_atom37': torch.Size([1, 1, 317, 14]), 'residx_atom37_to_atom14': torch.Size([1, 1, 317, 37]), 'atom14_atom_exists': torch.Size([1, 1, 317, 14]), 'atom37_atom_exists': torch.Size([1, 1, 317, 37]), 'target_feat': torch.Size([1, 1, 317, 22]), 'extra_msa': torch.Size([4, 1, 1024, 317]), 'extra_msa_mask': torch.Size([4, 1, 1024, 317]), 'extra_msa_row_mask': torch.Size([4, 1, 1024]), 'true_msa': torch.Size([4, 1, 508, 317]), 'extra_msa_has_deletion': torch.Size([4, 1, 1024, 317]), 'extra_msa_deletion_value': torch.Size([4, 1, 1024, 317]), 'msa_feat': torch.Size([4, 1, 508, 317, 49])} Inference time: 29.34663464399995 plddts {'monomer.unifold.pt_97923': '0.914682'} start to load params /root/params/uf_symmetry.pt start to predict unifold_bohrium_1 {'aatype': torch.Size([1, 1, 287]), 'residue_index': torch.Size([1, 1, 287]), 'seq_length': torch.Size([1, 1]), 'msa_chains': torch.Size([4, 1, 252, 1]), 'template_aatype': torch.Size([1, 1, 4, 287]), 
'template_all_atom_mask': torch.Size([1, 1, 4, 287, 37]), 'template_all_atom_positions': torch.Size([1, 1, 4, 287, 37, 3]), 'asym_id': torch.Size([1, 1, 287]), 'sym_id': torch.Size([1, 1, 287]), 'entity_id': torch.Size([1, 1, 287]), 'num_sym': torch.Size([1, 1, 287]), 'assembly_num_chains': torch.Size([1, 1, 1]), 'cluster_bias_mask': torch.Size([1, 1, 252]), 'bert_mask': torch.Size([4, 1, 252, 287]), 'msa_mask': torch.Size([4, 1, 252, 287]), 'asym_len': torch.Size([1, 1, 1]), 'num_recycling_iters': torch.Size([1, 1]), 'is_distillation': torch.Size([4, 1]), 'seq_mask': torch.Size([1, 1, 287]), 'msa_row_mask': torch.Size([4, 1, 252]), 'template_mask': torch.Size([1, 1, 4]), 'template_pseudo_beta': torch.Size([1, 1, 4, 287, 3]), 'template_pseudo_beta_mask': torch.Size([1, 1, 4, 287]), 'template_torsion_angles_sin_cos': torch.Size([1, 1, 4, 287, 7, 2]), 'template_alt_torsion_angles_sin_cos': torch.Size([1, 1, 4, 287, 7, 2]), 'template_torsion_angles_mask': torch.Size([1, 1, 4, 287, 7]), 'residx_atom14_to_atom37': torch.Size([1, 1, 287, 14]), 'residx_atom37_to_atom14': torch.Size([1, 1, 287, 37]), 'atom14_atom_exists': torch.Size([1, 1, 287, 14]), 'atom37_atom_exists': torch.Size([1, 1, 287, 37]), 'target_feat': torch.Size([1, 1, 287, 22]), 'extra_msa': torch.Size([4, 1, 1152, 287]), 'extra_msa_mask': torch.Size([4, 1, 1152, 287]), 'extra_msa_row_mask': torch.Size([4, 1, 1152]), 'true_msa': torch.Size([4, 1, 252, 287]), 'msa_feat': torch.Size([4, 1, 252, 287, 49]), 'extra_msa_has_deletion': torch.Size([4, 1, 1152, 287]), 'extra_msa_deletion_value': torch.Size([4, 1, 1152, 287]), 'symmetry_opers': torch.Size([1, 1, 2, 4, 4]), 'pseudo_residue_feat': torch.Size([1, 1, 8]), 'num_asym': torch.Size([1, 1])} Inference time: 15.05725186899997 plddts {'uf_symmetry.pt_97923': '0.93517303'} start to load params /root/params/uf_symmetry.pt start to predict unifold_bohrium_2 {'aatype': torch.Size([1, 1, 212]), 'residue_index': torch.Size([1, 1, 212]), 'seq_length': torch.Size([1, 
1]), 'msa_chains': torch.Size([4, 1, 252, 1]), 'template_aatype': torch.Size([1, 1, 4, 212]), 'template_all_atom_mask': torch.Size([1, 1, 4, 212, 37]), 'template_all_atom_positions': torch.Size([1, 1, 4, 212, 37, 3]), 'asym_id': torch.Size([1, 1, 212]), 'sym_id': torch.Size([1, 1, 212]), 'entity_id': torch.Size([1, 1, 212]), 'num_sym': torch.Size([1, 1, 212]), 'assembly_num_chains': torch.Size([1, 1, 1]), 'cluster_bias_mask': torch.Size([1, 1, 252]), 'bert_mask': torch.Size([4, 1, 252, 212]), 'msa_mask': torch.Size([4, 1, 252, 212]), 'asym_len': torch.Size([1, 1, 2]), 'num_recycling_iters': torch.Size([1, 1]), 'is_distillation': torch.Size([4, 1]), 'seq_mask': torch.Size([1, 1, 212]), 'msa_row_mask': torch.Size([4, 1, 252]), 'template_mask': torch.Size([1, 1, 4]), 'template_pseudo_beta': torch.Size([1, 1, 4, 212, 3]), 'template_pseudo_beta_mask': torch.Size([1, 1, 4, 212]), 'template_torsion_angles_sin_cos': torch.Size([1, 1, 4, 212, 7, 2]), 'template_alt_torsion_angles_sin_cos': torch.Size([1, 1, 4, 212, 7, 2]), 'template_torsion_angles_mask': torch.Size([1, 1, 4, 212, 7]), 'residx_atom14_to_atom37': torch.Size([1, 1, 212, 14]), 'residx_atom37_to_atom14': torch.Size([1, 1, 212, 37]), 'atom14_atom_exists': torch.Size([1, 1, 212, 14]), 'atom37_atom_exists': torch.Size([1, 1, 212, 37]), 'target_feat': torch.Size([1, 1, 212, 22]), 'extra_msa': torch.Size([4, 1, 336, 212]), 'extra_msa_mask': torch.Size([4, 1, 336, 212]), 'extra_msa_row_mask': torch.Size([4, 1, 336]), 'true_msa': torch.Size([4, 1, 252, 212]), 'msa_feat': torch.Size([4, 1, 252, 212, 49]), 'extra_msa_has_deletion': torch.Size([4, 1, 336, 212]), 'extra_msa_deletion_value': torch.Size([4, 1, 336, 212]), 'symmetry_opers': torch.Size([1, 1, 2, 4, 4]), 'pseudo_residue_feat': torch.Size([1, 1, 8]), 'num_asym': torch.Size([1, 1])} Inference time: 8.176409836000005 plddts {'uf_symmetry.pt_97923': '0.83992827'} start to load params /root/params/uf_symmetry.pt start to predict unifold_bohrium_3 {'aatype': 
torch.Size([1, 1, 156]), 'residue_index': torch.Size([1, 1, 156]), 'seq_length': torch.Size([1, 1]), 'msa_chains': torch.Size([4, 1, 252, 1]), 'template_aatype': torch.Size([1, 1, 4, 156]), 'template_all_atom_mask': torch.Size([1, 1, 4, 156, 37]), 'template_all_atom_positions': torch.Size([1, 1, 4, 156, 37, 3]), 'asym_id': torch.Size([1, 1, 156]), 'sym_id': torch.Size([1, 1, 156]), 'entity_id': torch.Size([1, 1, 156]), 'num_sym': torch.Size([1, 1, 156]), 'assembly_num_chains': torch.Size([1, 1, 1]), 'cluster_bias_mask': torch.Size([1, 1, 252]), 'bert_mask': torch.Size([4, 1, 252, 156]), 'msa_mask': torch.Size([4, 1, 252, 156]), 'asym_len': torch.Size([1, 1, 1]), 'num_recycling_iters': torch.Size([1, 1]), 'is_distillation': torch.Size([4, 1]), 'seq_mask': torch.Size([1, 1, 156]), 'msa_row_mask': torch.Size([4, 1, 252]), 'template_mask': torch.Size([1, 1, 4]), 'template_pseudo_beta': torch.Size([1, 1, 4, 156, 3]), 'template_pseudo_beta_mask': torch.Size([1, 1, 4, 156]), 'template_torsion_angles_sin_cos': torch.Size([1, 1, 4, 156, 7, 2]), 'template_alt_torsion_angles_sin_cos': torch.Size([1, 1, 4, 156, 7, 2]), 'template_torsion_angles_mask': torch.Size([1, 1, 4, 156, 7]), 'residx_atom14_to_atom37': torch.Size([1, 1, 156, 14]), 'residx_atom37_to_atom14': torch.Size([1, 1, 156, 37]), 'atom14_atom_exists': torch.Size([1, 1, 156, 14]), 'atom37_atom_exists': torch.Size([1, 1, 156, 37]), 'target_feat': torch.Size([1, 1, 156, 22]), 'extra_msa': torch.Size([4, 1, 1152, 156]), 'extra_msa_mask': torch.Size([4, 1, 1152, 156]), 'extra_msa_row_mask': torch.Size([4, 1, 1152]), 'true_msa': torch.Size([4, 1, 252, 156]), 'msa_feat': torch.Size([4, 1, 252, 156, 49]), 'extra_msa_has_deletion': torch.Size([4, 1, 1152, 156]), 'extra_msa_deletion_value': torch.Size([4, 1, 1152, 156]), 'symmetry_opers': torch.Size([1, 1, 3, 4, 4]), 'pseudo_residue_feat': torch.Size([1, 1, 8]), 'num_asym': torch.Size([1, 1])} Inference time: 5.369833519999986 plddts {'uf_symmetry.pt_97923': '0.9115123'}
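When times is greater than 1, each job reports several pLDDT scores, and one natural step is to keep the attempt with the highest score. The sketch below assumes the scores arrive as a dict of name-to-string pairs, as in the plddts lines of the log above. best_prediction is a hypothetical helper, and the second entry in the example dict is invented for illustration.

```python
def best_prediction(plddts):
    """Return (name, score) of the entry with the highest mean pLDDT.

    Scores are stored as strings in the log, so they are converted to
    float before comparison.
    """
    if not plddts:
        raise ValueError("no predictions to choose from")
    best_name = max(plddts, key=lambda name: float(plddts[name]))
    return best_name, float(plddts[best_name])


# First entry copied from the log above; the second one is made up.
plddts = {
    "uf_symmetry.pt_97923": "0.9115123",
    "uf_symmetry.pt_97924": "0.88",
}
name, score = best_prediction(plddts)
print(f"best attempt: {name} (pLDDT {score:.3f})")
```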
3. Visualization
Visualize the predicted structure and the pLDDT of the Uni-Fold output.
Construct multiclass B-factors to indicate confidence bands:
- 0 = very low, 1 = low, 2 = confident, 3 = very high
- Color bands for visualizing pLDDT
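The mapping from per-residue pLDDT to the four bands can be sketched as follows. The cut-offs of 50, 70 and 90 (on a 0 to 100 scale) are an assumption borrowed from the convention used by the AlphaFold Protein Structure Database; the notebook's own thresholds are not shown here and may differ. plddt_to_band is a hypothetical helper.

```python
from bisect import bisect_right

# Assumed band edges on a 0-100 scale (AlphaFold DB convention).
BAND_EDGES = [50.0, 70.0, 90.0]
BAND_NAMES = ["very low", "low", "confident", "very high"]


def plddt_to_band(plddt):
    """Map a pLDDT value on a 0-100 scale to a band index 0..3.

    bisect_right counts how many edges are <= plddt, so a value exactly
    on an edge falls into the higher band.
    """
    return bisect_right(BAND_EDGES, plddt)


# Scores in the prediction log are fractions in [0, 1]; rescale them first.
for fraction in (0.42, 0.65, 0.84, 0.914682):
    band = plddt_to_band(100.0 * fraction)
    print(f"pLDDT {100.0 * fraction:5.1f} -> band {band} ({BAND_NAMES[band]})")
```

Writing the band index into the B-factor column of each residue then lets a viewer color the structure with one discrete color per band.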