CryoJAM: Automating Protein Homolog Fitting in Medium Resolution Cryo-EM Density Maps

Jackson Thomas Carrion, Mrunali Manjrekar, Anna Mikulevica
DOI: https://doi.org/10.1101/2024.07.10.602952
2024-07-11
Abstract: Obtaining atomic structures of large protein complexes from medium-resolution cryogenic electron-microscopy (cryo-EM) density maps is a critical bottleneck in the cryo-EM workflow. CryoJAM aims to automate this process by using a 3D Convolutional Neural Network model within a U-Net architecture. This model is trained with a novel loss function that leverages Fourier Shell Correlation (FSC) as a proxy for quality of fit and Root Mean Squared Error (RMSE) to help optimize fits within real space. Capitalizing on the gold-standard status of FSC in cryo-EM, this method introduces an innovative implementation of FSC into cryo-EM model fitting software, enhancing the precision and efficiency of structural analysis. After 25 epochs, CryoJAM successfully reduced the RMSE in 21 of the 26 test cases, effectively fitting homologous protein structures into medium-resolution cryo-EM densities.
Biophysics
What problem does this paper attempt to address?
The paper addresses the problem of automatically fitting homologous protein structures into medium-resolution cryo-electron microscopy (cryo-EM) density maps. Obtaining atomic structures of large protein complexes from such maps is a critical bottleneck in the cryo-EM workflow. Existing approaches, such as Molecular Dynamics Flexible Fitting (MDFF) or pipelines that incorporate advanced protein structure prediction models like AlphaFold, are computationally expensive and become impractical for large complexes above 500 kDa. Researchers therefore need a computational strategy that integrates atomic-scale structural data into large medium-resolution density maps efficiently and accurately, without extensive manual intervention. To this end, the paper introduces CryoJAM, a deep learning model built on a 3D Convolutional Neural Network (CNN) in a U-Net architecture and trained with a new loss function that combines Fourier Shell Correlation (FSC) and Root Mean Squared Error (RMSE). FSC serves as a proxy for quality of fit, while RMSE helps optimize the fit in real space. After 25 training epochs, CryoJAM reduced the RMSE in 21 of the 26 test cases, effectively fitting homologous protein structures into medium-resolution cryo-EM density maps.
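To make the idea of a loss that combines FSC and RMSE more concrete, here is a minimal NumPy sketch of the two quantities computed on voxelized density volumes. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `fsc_weight` parameter, the use of `1 - mean(FSC)` as the Fourier-space term, and the linear weighting are all hypothetical, since the summary does not specify how the two terms are combined. Training a network would also require a differentiable version in a deep learning framework.

```python
import numpy as np


def fourier_shell_correlation(vol_a, vol_b):
    """Per-shell FSC between two equally shaped 3D volumes."""
    if vol_a.shape != vol_b.shape:
        raise ValueError("volumes must have the same shape")
    fa = np.fft.fftn(vol_a)
    fb = np.fft.fftn(vol_b)

    # Integer spatial-frequency coordinates along each axis.
    freqs = [np.fft.fftfreq(n) * n for n in vol_a.shape]
    kx, ky, kz = np.meshgrid(*freqs, indexing="ij")
    # Assign every Fourier voxel to the nearest integer-radius shell.
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)

    # Keep only shells up to the Nyquist radius of the smallest axis.
    n_shells = min(vol_a.shape) // 2
    mask = shell < n_shells
    idx = shell[mask]

    # Shell-wise sums of the cross-spectrum and of both power spectra.
    cross = np.bincount(idx, weights=(fa * np.conj(fb)).real[mask],
                        minlength=n_shells)
    power_a = np.bincount(idx, weights=(np.abs(fa) ** 2)[mask],
                          minlength=n_shells)
    power_b = np.bincount(idx, weights=(np.abs(fb) ** 2)[mask],
                          minlength=n_shells)

    denom = np.sqrt(power_a * power_b)
    # Shells with zero power are reported as 0 instead of dividing by zero.
    return np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)


def rmse(vol_a, vol_b):
    """Root mean squared error between two volumes in real space."""
    return float(np.sqrt(np.mean((vol_a - vol_b) ** 2)))


def combined_loss(pred, target, fsc_weight=0.5):
    """Hypothetical weighted sum of (1 - mean FSC) and RMSE.

    The weighting scheme is an assumption made for illustration only.
    """
    fsc_term = 1.0 - float(np.mean(fourier_shell_correlation(pred, target)))
    return fsc_weight * fsc_term + (1.0 - fsc_weight) * rmse(pred, target)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(size=(16, 16, 16))
    noisy = target + 0.5 * rng.normal(size=target.shape)
    print("loss (identical):", combined_loss(target, target))
    print("loss (noisy):    ", combined_loss(noisy, target))
```

In this sketch a perfect fit drives both terms to zero: identical volumes give an FSC of 1 in every shell and an RMSE of 0. A noisier or misaligned volume lowers the FSC and raises the RMSE, so the loss increases.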