CryoTransformer: a transformer model for picking protein particles from Cryo-EM micrographs

Ashwin Dhakal, Rajan Gyawali, Liguo Wang, Jianlin Cheng
DOI: https://doi.org/10.1093/bioinformatics/btae109
IF: 5.8
2024-02-24
Bioinformatics
Abstract:
Motivation: Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structures of large protein complexes. Picking single protein particles from cryo-EM micrographs (images) is a crucial step in reconstructing protein structures from them. However, the widely used template-based particle picking process requires some manual particle picking and is labor-intensive and time-consuming. Though machine learning and artificial intelligence (AI) can potentially automate particle picking, the current AI methods pick particles with low precision or low recall. The erroneously picked particles can severely reduce the quality of reconstructed protein structures, especially for micrographs with a low signal-to-noise ratio (SNR).
Results: To address these shortcomings, we devised CryoTransformer based on transformers, residual networks, and image processing techniques to accurately pick protein particles from cryo-EM micrographs. CryoTransformer was trained and tested on the largest labelled cryo-EM protein particle dataset, CryoPPP. It outperforms the current state-of-the-art machine learning methods of particle picking in terms of the resolution of 3D density maps reconstructed from the picked particles as well as F1-score, and is poised to facilitate the automation of cryo-EM protein particle picking.
Availability: The source code and data for CryoTransformer are openly available at: https://github.com/jianlin-cheng/CryoTransformer.
Supplementary information: Supplementary data are available at Bioinformatics online.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology
What problem does this paper attempt to address?
The paper addresses the problem of accurately picking protein particles from cryo-electron microscopy (cryo-EM) micrographs. The widely used template-based approach requires some manual particle picking, which is time-consuming and labor-intensive, while existing AI methods pick particles with either low precision or low recall. Erroneously picked particles can severely reduce the quality of the reconstructed protein structures, especially for micrographs with a low signal-to-noise ratio. The paper therefore proposes CryoTransformer, a model that combines transformers, residual networks, and image processing techniques to pick protein particles accurately from cryo-EM micrographs. CryoTransformer was trained and tested on CryoPPP, the largest labeled cryo-EM protein particle dataset, and outperformed current state-of-the-art machine learning particle pickers in terms of both the resolution of the 3D density maps reconstructed from the picked particles and F1-score. The authors present it as a step toward automating cryo-EM protein particle picking.
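To make the precision, recall, and F1-score terminology above concrete, here is a minimal Python sketch of how picked particle centers could be scored against labeled centers. It is not the paper's evaluation code: the function names, the greedy nearest-neighbor matching, and the distance threshold `max_dist` are all assumptions made for illustration.

```python
import math


def count_true_positives(picked, truth, max_dist):
    """Greedily match each picked center to the nearest unmatched labeled center.

    A pick counts as a true positive when a still-unmatched labeled center lies
    within max_dist pixels (an assumed matching rule, not the paper's protocol).
    """
    unmatched = set(range(len(truth)))
    true_positives = 0
    for px, py in picked:
        best, best_d = None, max_dist
        for i in unmatched:
            d = math.hypot(px - truth[i][0], py - truth[i][1])
            if d <= best_d:
                best, best_d = i, d
        if best is not None:
            unmatched.discard(best)  # each labeled particle is matched at most once
            true_positives += 1
    return true_positives


def precision_recall_f1(picked, truth, max_dist):
    """Return (precision, recall, F1) for a list of picks on one micrograph."""
    tp = count_true_positives(picked, truth, max_dist)
    precision = tp / len(picked) if picked else 0.0
    recall = tp / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical coordinates: 3 labeled particles, 4 picks, 2 of them correct.
truth = [(10, 10), (50, 50), (90, 90)]
picked = [(11, 10), (52, 49), (200, 200), (300, 300)]
p, r, f1 = precision_recall_f1(picked, truth, max_dist=5)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this toy case half of the picks are correct (precision 0.5) and two of three labeled particles are found (recall about 0.67). F1 is the harmonic mean of the two, so a picker scores well only if it avoids both false picks and missed particles.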