3D Reconstruction of Protein Complex Structures Using Synthesized Multi-View AFM Images

Jaydeep Rade, Soumik Sarkar, Anwesha Sarkar, Adarsh Krishnamurthy
DOI: https://doi.org/10.48550/arXiv.2211.14662
2022-11-27
Abstract: Recent developments in deep learning-based methods have demonstrated their potential to predict 3D protein structures from inputs such as protein sequences and cryo-electron microscopy (Cryo-EM) images of proteins. However, these methods struggle to predict the structures of protein complexes (PCs), which consist of more than one protein. In this work, we explore atomic force microscope (AFM) assisted deep learning-based methods to predict the 3D structure of PCs. The images produced by AFM capture the protein structure in different, random orientations. These multi-view images can help train a neural network to predict the 3D structure of protein complexes. However, obtaining a dataset of actual AFM images is time-consuming and impractical. We propose a virtual AFM imaging pipeline that takes a 'PDB' protein file and generates multi-view 2D virtual AFM images using volume rendering techniques. With this pipeline, we created a dataset of around 8K proteins. We train Pix2Vox++, a neural network for 3D reconstruction, on the synthesized multi-view 2D AFM image dataset. We compare the predicted structures obtained using different numbers of views and obtain an intersection over union (IoU) value of 0.92 on the training dataset and 0.52 on the validation dataset. We believe this approach will lead to better prediction of the structure of protein complexes.
Computer Vision and Pattern Recognition, Machine Learning, Image and Video Processing, Quantitative Methods
What problem does this paper attempt to address?
The problem this paper attempts to solve is **predicting the 3D structure of protein complexes (PCs)**. Current methods have made remarkable progress in predicting the 3D structure of a single protein, but they perform poorly on complexes composed of multiple proteins. In particular, even advanced models such as AlphaFold2 have difficulty predicting the structures of certain protein complexes (such as the WRC protein complex). To address this, the authors propose a deep learning method based on atomic force microscopy (AFM) images. AFM captures images of proteins in different, random orientations, and these multi-view images are useful for training a neural network to predict the 3D structure of protein complexes. However, collecting a dataset of real AFM images is time-consuming and impractical. The authors therefore developed a virtual AFM imaging pipeline that generates multi-view 2D virtual AFM images from PDB files, and used it to create a dataset of about 8,000 protein samples. With this dataset, they trained a neural network named Pix2Vox++ to reconstruct the 3D structure of protein complexes from multi-view AFM images. The reported results show high accuracy on the training set (IoU of 0.92) but a marked drop on the validation set (IoU of 0.52), which suggests that the model may be overfitting. Nevertheless, the authors believe that this approach can lead to better prediction of the structure of protein complexes in the future.
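The IoU values quoted above compare a predicted 3D structure with the ground-truth structure. The following is a minimal sketch of how such a score could be computed on voxel grids; it is not the authors' evaluation code. The function names `occupied_voxels` and `voxel_iou`, the nested-list grid representation, and the 0.5 occupancy threshold are all assumptions made for illustration.

```python
def occupied_voxels(grid, threshold=0.5):
    """Return the set of (x, y, z) indices whose value exceeds the threshold.

    `grid` is a nested list indexed as grid[x][y][z]; values can be binary
    occupancies or predicted occupancy probabilities (an assumed layout).
    """
    return {
        (x, y, z)
        for x, plane in enumerate(grid)
        for y, row in enumerate(plane)
        for z, value in enumerate(row)
        if value > threshold
    }


def voxel_iou(predicted, target, threshold=0.5):
    """Intersection over union of two voxel grids after thresholding."""
    pred = occupied_voxels(predicted, threshold)
    true = occupied_voxels(target, threshold)
    union = pred | true
    if not union:
        # Both grids are empty; treating this as perfect agreement
        # is a convention chosen here, not something the paper specifies.
        return 1.0
    return len(pred & true) / len(union)


# Toy 2x2x2 example: ground truth has 3 occupied voxels, the prediction
# has 3 voxels above the threshold, and 2 of them coincide.
target = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
predicted = [[[0.9, 0.8], [0.7, 0.1]], [[0.2, 0.0], [0.1, 0.0]]]

print(voxel_iou(predicted, target))  # 2 shared voxels / 4 in the union = 0.5
```

An IoU of 1.0 means the thresholded prediction matches the ground truth voxel for voxel, while 0.0 means no overlap at all. On that scale, the gap between 0.92 on training data and 0.52 on validation data is what points to overfitting.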