MemBrain: A Deep Learning-aided Pipeline for Automated Detection of Membrane Proteins in Cryo-electron Tomograms

Lorenz Lamm,Ricardo D. Righetto,Wojciech Wietrzynski,Matthias Pöge,Antonio Martinez-Sanchez,Tingying Peng,Benjamin D. Engel
DOI: https://doi.org/10.1101/2022.03.01.480844
2022-03-01
Abstract:Abstract Background and Objective Cryo-electron tomography (cryo-ET) is an imaging technique that enables 3D visualization of the native cellular environment at sub-nanometer resolution, providing unpreceded insights into the molecular organization of cells. However, cryo-electron tomograms suffer from low signal-to-noise ratios and anisotropic resolution, which makes subsequent image analysis challenging. In particular, the automated detection of membrane-embedded proteins is a problem still lacking satisfactory solutions. Methods We present MemBrain – a new deep learning-based pipeline that automatically detects membrane-bound protein complexes in cryo-electron tomograms. After subvolumes are sampled along a segmented membrane, each subvolume is assigned a score using a convolutional neural network (CNN), and protein positions are extracted by a clustering algorithm. Incorporating rotational subvolume normalization and using a tiny receptive field simplify the task of protein detection and thus facilitate the network training. Results MemBrain requires only a small quantity of training labels and achieves excellent performance with only a single annotated membrane (F1 score: 0.88). A detailed evaluation shows that our fully trained pipeline outperforms existing classical computer vision-based and CNN-based approaches by a large margin (F1 score: 0.92 vs. max. 0.63). Furthermore, in addition to protein center positions, MemBrain can determine protein orientations, which has not been implemented by any existing CNN-based method to date. We also show that a pre-trained MemBrain program generalizes to tomograms acquired using different cryo-ET methods and depicting different types of cells. Conclusions MemBrain is a powerful and label-efficient tool for the detection of membrane protein complexes in cryo-ET data, with the potential to be used in a wide range of biological studies. It is generalizable to various kinds of tomograms, making it possible to use pretrained models for different tasks. Its efficiency in terms of required annotations also allows rapid training and fine-tuning of models. The corresponding code, pretrained models, and instructions for operating the MemBrain program can be found at: https://github.com/CellArchLab/MemBrain
What problem does this paper attempt to address?