Backbone 1H, 13C and 15N resonance assignments of the 39 kDa staphylococcal hemoglobin receptor IsdH

T. Spirig,R. Clubb

DOI: https://doi.org/10.1007/s12104-011-9348-8

2011-11-19

Biomolecular NMR Assignments

Protein Subcellular Localization Prediction by Concatenation of Convolutional Blocks for Deep Features Extraction From Microscopic Images

Sonam Aggarwal,Sapna Juneja,Junaid Rashid,Deepali Gupta,Sheifali Gupta,Jungeun Kim

DOI: https://doi.org/10.1109/access.2022.3232564

IF: 3.9

2023-01-10

IEEE Access

Abstract:Understanding where proteins are located within the cells is essential for proteomics research. Knowledge of protein subcellular location aids in early disease detection and drug targeting treatments. Incorrect localization of proteins can interfere with the functioning of cells and lead to illnesses like cancer. Technological advances have enabled computational methods to detect a protein's subcellular location in living organisms. The advent of high-quality microscopy has led to the development of image-based prediction algorithms for protein subcellular localization. Confocal microscopy, which is used by the Human Protein Atlas (HPA), is a great tool for locating proteins. The HPA database comprises millions of images which have been procured using confocal microscopy and are annotated with single as well as multiple labels. However, the multi-instance nature of the classification task and the low quality of the images make image-based prediction an extremely difficult problem. There are probably just a few algorithms for automatically predicting protein localization, and most of them are limited to single-label classification. Therefore, it is important to develop a satisfactory automatic multi-label HPA recognition system. The aim of this research is to design a deep learning-based automatic recognition system for multi-label classification of HPA images. Specifically, a novel Convolutional Neural Network design for classifying protein distribution across 28 subcellular compartments has been presented in this paper. Extensive experiments have been done on the proposed model to achieve the best results for multilabel classification. With the proposed CNN framework, an F1-score of 0.77 was achieved, which outperformed the latest approaches.
Extracting Cellular Location of Human Proteins Using Deep Learning

Hanke Chen

DOI: https://doi.org/10.48550/arXiv.2006.03800

2020-06-06

Computer Vision and Pattern Recognition

Abstract:Understanding and extracting the patterns of microscopy images has been a major challenge in the biomedical field. Although trained scientists can locate the proteins of interest within a human cell, this procedure is not efficient and accurate enough to process a large amount of data and it often leads to bias. To resolve this problem, we attempted to create an automatic image classifier using Machine Learning to locate human proteins with higher speed and accuracy than human beings. We implemented a Convolution Neural Network with Residue and Squeeze-Excitation layers classifier to locate given proteins of any type in a subcellular structure. After training the model using a series of techniques, it can locate thousands of proteins in 27 different human cell types into 28 subcellular locations, significantly more than historical approaches. The model can classify 4,500 images per minute with an accuracy of 63.07%, surpassing human performance in accuracy (by 35%) and speed. Because our system can be implemented on different cell types, it opens a new vision of understanding in the biomedical field. From the locational information of the human proteins, doctors can easily detect a cell's abnormal behaviors including viral infection, pathogen invasion, and malignant tumor development. Given that the amount of data generated by experiments is greater than humans can analyze, the model cuts down the human resources and time needed to analyze data. Moreover, this locational information can be used in different scenarios like subcellular engineering, medical care, and etiology inspection.
Imbalanced multi-modal multi-label learning for subcellular localization prediction of human proteins with both single and multiple sites

Jianjun He,Hong Gu,Wenqi Liu

DOI: https://doi.org/10.1371/journal.pone.0037155

IF: 3.7

PLoS ONE

Abstract:It is well known that an important step toward understanding the functions of a protein is to determine its subcellular location. Although numerous prediction algorithms have been developed, most of them typically focused on proteins with only one location. In recent years, researchers have begun to pay attention to the subcellular localization prediction of proteins with multiple sites. However, almost all the existing approaches have failed to take into account the correlations among the locations caused by proteins with multiple sites, which may be important information for improving the prediction accuracy for such proteins. In this paper, a new algorithm which can effectively exploit the correlations among the locations is proposed by using a Gaussian process model. In addition, the algorithm can realize an optimal linear combination of various feature extraction technologies and could be robust to imbalanced data sets. Experimental results on a human protein data set show that the proposed algorithm is valid and can achieve better performance than the existing approaches.
A multiresolution approach to automated classification of protein subcellular location images

Amina Chebira,Yann Barbotin,Charles Jackson,Thomas Merryman,Gowri Srinivasa,Robert F Murphy,Jelena Kovačević

DOI: https://doi.org/10.1186/1471-2105-8-210

IF: 3.307

2007-06-19

BMC Bioinformatics

Abstract:Background: Fluorescence microscopy is widely used to determine the subcellular location of proteins. Efforts to determine location on a proteome-wide basis create a need for automated methods to analyze the resulting images. Over the past ten years, the feasibility of using machine learning methods to recognize all major subcellular location patterns has been convincingly demonstrated, using diverse feature sets and classifiers. On a well-studied data set of 2D HeLa single-cell images, the best performance to date, 91.5%, was obtained by including a set of multiresolution features. This demonstrates the value of multiresolution approaches to this important problem. Results: We report here a novel approach for the classification of subcellular location patterns by classifying in multiresolution subspaces. Our system is able to work with any feature set and any classifier. It consists of multiresolution (MR) decomposition, followed by feature computation and classification in each MR subspace, yielding local decisions that are then combined into a global decision. With 26 texture features alone and a neural network classifier, we obtained an increase in accuracy on the 2D HeLa data set to 95.3%. Conclusion: We demonstrate that the space-frequency localized information in the multiresolution subspaces adds significantly to the discriminative power of the system. Moreover, we show that a vastly reduced set of features is sufficient, consisting of our novel modified Haralick texture features. Our proposed system is general, allowing for any combinations of sets of features and any combination of classifiers.

biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
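
The abstract above describes classifying in each multiresolution subspace and then combining the local decisions into a global one. A minimal sketch of that last step follows, assuming each subspace classifier emits a class-probability vector and that fusion is a weighted average. The abstract does not state the combination rule, so the function name `combine_local_decisions`, the weights, and the example numbers are all hypothetical.

```python
def combine_local_decisions(local_probs, weights=None):
    """Fuse per-subspace class-probability vectors into one global decision.

    local_probs: list of lists, one probability vector per subspace.
    weights: optional per-subspace weights (assumed; defaults to uniform).
    Returns (index of the winning class, fused score per class).
    """
    n_subspaces = len(local_probs)
    n_classes = len(local_probs[0])
    if weights is None:
        weights = [1.0 / n_subspaces] * n_subspaces

    fused = [0.0] * n_classes
    for w, probs in zip(weights, local_probs):
        for c, p in enumerate(probs):
            fused[c] += w * p  # weighted vote of this subspace for class c

    best = max(range(n_classes), key=lambda c: fused[c])
    return best, fused


# Made-up local decisions from three subspaces over three classes.
local = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
label, scores = combine_local_decisions(local)
```

With uniform weights the second subspace is outvoted by the other two. Putting all weight on it, `weights=[0.0, 1.0, 0.0]`, flips the decision, which is why the choice of weights matters in any such fusion scheme.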
PScL-DDCFPred: an ensemble deep learning-based approach for characterizing multiclass subcellular localization of human proteins from bioimage data

Matee Ullah,Fazal Hadi,Jiangning Song,Dong-Jun Yu

DOI: https://doi.org/10.1093/bioinformatics/btac432

IF: 5.8

2022-07-01

Bioinformatics

Abstract:Characterization of protein subcellular localization has become an important and long-standing task in bioinformatics and computational biology, which provides valuable information for elucidating various cellular functions of proteins and guiding drug design. Here, we develop a novel bioimage-based computational approach, termed PScL-DDCFPred, to accurately predict protein subcellular localizations in human tissues. PScL-DDCFPred first extracts multiview image features, including global and local features, as base or pure features; next, it applies a new integrative feature selection method based on stepwise discriminant analysis and generalized discriminant analysis to identify the optimal feature sets from the extracted pure features; finally, a classifier based on deep neural network (DNN) and deep-cascade forest (DCF) is established. Stringent ten-fold cross-validation tests on the new protein subcellular localization training dataset, constructed from the human protein atlas databank, illustrate that PScL-DDCFPred achieves a better performance than several existing state-of-the-art methods. Moreover, the independent test set further illustrates the generalization capability and superiority of PScL-DDCFPred over existing predictors. In-depth analysis shows that the excellent performance of PScL-DDCFPred can be attributed to three critical factors, namely the effective combination of the DNN and DCF models, complementarity of global and local features, and use of the optimal feature sets selected by the integrative feature selection algorithm. https://github.com/csbio-njust-edu/PScL-DDCFPred Supplementary data are available at Bioinformatics online.

biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
An Artificial Intelligence-Based Stacked Ensemble Approach for Prediction of Protein Subcellular Localization in Confocal Microscopy Images

Sonam Aggarwal,Sheifali Gupta,Deepali Gupta,Yonis Gulzar,Sapna Juneja,Ali A. Alwan,Ali Nauman

DOI: https://doi.org/10.3390/su15021695

IF: 3.9

2023-01-17

Sustainability

Abstract:Predicting subcellular protein localization has become a popular topic due to its utility in understanding disease mechanisms and developing innovative drugs. With the rapid advancement of automated microscopic imaging technology, approaches using bio-images for protein subcellular localization have gained a lot of interest. The Human Protein Atlas (HPA) project is a macro-initiative that aims to map the human proteome utilizing antibody-based proteomics and related c. Millions of images have been tagged with single or multiple labels in the HPA database. However, fewer techniques for predicting the location of proteins have been devised, with the majority of them relying on automatic single-label classification. As a result, there is a need for an automatic and sustainable system capable of multi-label classification of the HPA database. Deep learning presents a potential option for automatic labeling of protein's subcellular localization, given the vast image number generated by high-content microscopy and the fact that manual labeling is both time-consuming and error-prone. Hence, this research aims to use an ensemble technique for the improvement in the performance of existing state-of-art convolutional neural networks and pretrained models were applied; finally, a stacked ensemble-based deep learning model was presented, which delivers a more reliable and robust classifier. The F1-score, precision, and recall have been used for the evaluation of the proposed model's efficiency. In addition, a comparison of existing deep learning approaches has been conducted with respect to the proposed method. The results show the proposed ensemble strategy performed exponentially well on the multi-label classification of Human Protein Atlas images, with recall, precision, and F1-score of 0.70, 0.72, and 0.71, respectively.

environmental sciences,environmental studies,green & sustainable science & technology
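
The abstract above reports recall 0.70, precision 0.72 and F1-score 0.71. The standard F1 definition is the harmonic mean of precision and recall, and the short check below shows the three numbers are consistent under it. This is an illustration only: the abstract does not say how scores were averaged across labels, and for macro-averaged multi-label metrics the F1 need not equal the harmonic mean of the averaged precision and recall. The function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract above.
reported_precision = 0.72
reported_recall = 0.70
f1 = f1_score(reported_precision, reported_recall)
f1_rounded = round(f1, 2)  # compare against the reported 0.71
```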
Deep learning model for protein multi-label subcellular localization and function prediction based on multi-task collaborative training

Peihao Bai,Guanghui Li,Jiawei Luo,Cheng Liang

DOI: https://doi.org/10.1093/bib/bbae568

IF: 9.5

2024-11-05

Briefings in Bioinformatics

Abstract:The functional study of proteins is a critical task in modern biology, playing a pivotal role in understanding the mechanisms of pathogenesis, developing new drugs, and discovering novel drug targets. However, existing computational models for subcellular localization face significant challenges, such as reliance on known Gene Ontology (GO) annotation databases or overlooking the relationship between GO annotations and subcellular localization. To address these issues, we propose DeepMTC, an end-to-end deep learning-based multi-task collaborative training model. DeepMTC integrates the interrelationship between subcellular localization and the functional annotation of proteins, leveraging multi-task collaborative training to eliminate dependence on known GO databases. This strategy gives DeepMTC a distinct advantage in predicting newly discovered proteins without prior functional annotations. First, DeepMTC leverages a pre-trained language model with high accuracy to obtain the 3D structure and sequence features of proteins. Additionally, it employs a graph transformer module to encode protein sequence features, addressing the problem of long-range dependencies in graph neural networks. Finally, DeepMTC uses a functional cross-attention mechanism to efficiently combine upstream learned functional features to perform the subcellular localization task. The experimental results demonstrate that DeepMTC outperforms state-of-the-art models in both protein function prediction and subcellular localization. Moreover, interpretability experiments revealed that DeepMTC can accurately identify the key residues and functional domains of proteins, confirming its superior performance. The code and dataset of DeepMTC are freely available at https://github.com/ghli16/DeepMTC.

biochemical research methods,mathematical & computational biology
Self-Supervised Deep Learning Encodes High-Resolution Features of Protein Subcellular Localization

Hirofumi Kobayashi,Keith C. Cheveralls,Manuel D. Leonetti,Loic A. Royer

DOI: https://doi.org/10.1101/2021.03.29.437595

2021-03-29

Abstract:Elucidating the diversity and complexity of protein localization is essential to fully understand cellular architecture. Here, we present cytoself, a deep-learning approach for fully self-supervised protein localization profiling and clustering. cytoself leverages a self-supervised training scheme that does not require pre-existing knowledge, categories, or annotations. Training cytoself on images of 1,311 endogenously labeled proteins from the OpenCell database reveals a highly resolved protein localization atlas that recapitulates major scales of cellular organization, from coarse classes such as nuclear, cytoplasmic and vesicular, to the subtle localization signatures of individual protein complexes. We quantitatively validate cytoself’s ability to cluster proteins into organelles and protein complex clusters using a clustering score, and show that cytoself attains higher scores than previous unsupervised or self-supervised approaches. Finally, to better understand the inner workings of our model, we dissect the emergent features from which our clustering is derived, interpret these features in the context of the fluorescence images, and analyze the performance contributions of the different components of our approach.
Single-cell Subcellular Protein Localisation Using Novel Ensembles of Diverse Deep Architectures

Syed Sameed Husain,Eng-Jon Ong,Dmitry Minskiy,Mikel Bober-Irizar,Amaia Irizar,Miroslaw Bober

DOI: https://doi.org/10.48550/arXiv.2205.09841

2022-09-17

Abstract:Unravelling protein distributions within individual cells is key to understanding their function and state and indispensable to developing new treatments. Here we present the Hybrid subCellular Protein Localiser (HCPL), which learns from weakly labelled data to robustly localise single-cell subcellular protein patterns. It comprises innovative DNN architectures exploiting wavelet filters and learnt parametric activations that successfully tackle drastic cell variability. HCPL features correlation-based ensembling of novel architectures that boosts performance and aids generalisation. Large-scale data annotation is made feasible by our "AI-trains-AI" approach, which determines the visual integrity of cells and emphasises reliable labels for efficient training. In the Human Protein Atlas context, we demonstrate that HCPL defines state-of-the-art in the single-cell classification of protein localisation patterns. To better understand the inner workings of HCPL and assess its biological relevance, we analyse the contributions of each system component and dissect the emergent features from which the localisation predictions are derived.

Computer Vision and Pattern Recognition
Multilabel Learning for Protein Subcellular Location Prediction

Xiao Wang,Guo-Zheng Li,Jia-Ming Liu,Rui-Wei Zhao

DOI: https://doi.org/10.1109/bibm.2011.36

2011-01-01

Abstract:Protein subcellular localization aims at predicting the location of a protein within a cell using computational methods. Knowledge of subcellular localization of proteins indicates protein functions and helps in identifying drug targets. Prediction of protein subcellular localization is an important but challenging problem, particularly when proteins may simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing protein subcellular localization methods are only used to deal with the single-location proteins. To better reflect the characteristics of multiplex proteins, we formulate prediction of subcellular localization of multiplex proteins as a multilabel learning problem. We present and compare two multilabel learning approaches, which exploit correlations between labels and leverage label-specific features, respectively, to induce a high quality prediction model. Experimental results on six protein data sets under various organisms show that our described methods achieve significantly higher performance than any of the existing methods. Among the different multilabel learning methods, we find that methods exploiting label correlations perform better than those leveraging label-specific features.
Learning Consistent Subcellular Landmarks to Quantify Changes in Multiplexed Protein Maps.

Hannah Spitzer,Scott Berry,Mark Donoghoe,Lucas Pelkmans,Fabian J. Theis

DOI: https://doi.org/10.1038/s41592-023-01894-z

IF: 48

2023-01-01

Nature Methods

Abstract:Highly multiplexed imaging holds enormous promise for understanding how spatial context shapes the activity of the genome and its products at multiple length scales. Here, we introduce a deep learning framework called CAMPA (Conditional Autoencoder for Multiplexed Pixel Analysis), which uses a conditional variational autoencoder to learn representations of molecular pixel profiles that are consistent across heterogeneous cell populations and experimental perturbations. Clustering these pixel-level representations identifies consistent subcellular landmarks, which can be quantitatively compared in terms of their size, shape, molecular composition and relative spatial organization. Using high-resolution multiplexed immunofluorescence, this reveals how subcellular organization changes upon perturbation of RNA synthesis, RNA processing or cell size, and uncovers links between the molecular composition of membraneless organelles and cell-to-cell variability in bulk RNA synthesis rates. By capturing interpretable cellular phenotypes, we anticipate that CAMPA will greatly accelerate the systematic mapping of multiscale atlases of biological organization to identify the rules by which context shapes physiology and disease.
Identify ncRNA Subcellular Localization via Graph Regularized k-Local Hyperplane Distance Nearest Neighbor Model on Multi-Kernel Learning

Haohao Zhou,Hao Wang,Jijun Tang,Yijie Ding,Fei Guo

DOI: https://doi.org/10.1109/TCBB.2021.3107621

Abstract:Non-coding RNAs (ncRNAs) are a type of RNA which is not used to encode protein sequences. Emerging evidence shows that many ncRNAs may participate in many biological processes and must be widely involved in many types of cancers. Therefore, understanding their functionality is of great importance. Similar to proteins, the various functions of ncRNAs rely on their subcellular localizations. Traditional high-throughput wet-lab methods to identify subcellular localization are time-consuming and costly. In this paper, we propose a novel computational method based on multi-kernel learning to identify multi-label ncRNA subcellular localizations, via a graph regularized k-local hyperplane distance nearest neighbor algorithm. First, we construct six types of sequence-based feature descriptors and select important feature vectors. Then, we build a multi-kernel learning model with the Hilbert-Schmidt independence criterion (HSIC) to obtain optimal weights for various features. Furthermore, we propose the graph regularized k-local hyperplane distance nearest neighbor algorithm (GHKNN) as a binary classification model for detecting one kind of non-coding RNA subcellular localization. Finally, we apply a One-vs-Rest strategy to decompose the multi-label problem of non-coding RNA subcellular localizations. Our method achieves excellent performance on three ncRNA datasets and three human ncRNA datasets, and outperforms other outstanding machine learning methods. Compared to existing methods, our model also performs well, especially on small datasets. We expect that this model will be useful for the prediction of subcellular localization and the study of important functional mechanisms of ncRNAs. Furthermore, we establish a user-friendly web server (http://ncrna.lbci.net/) with the implementation of our method, which can be easily used by most experimental scientists.
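
The One-vs-Rest step mentioned in the abstract above turns one multi-label problem into one binary problem per location. A minimal sketch of that decomposition and of the reverse merge follows. It does not implement GHKNN or the multi-kernel model. The helper names and the location names in the example are hypothetical, chosen only for illustration.

```python
def to_binary_problems(label_sets, classes):
    """Build one 0/1 target vector per class from multi-label annotations."""
    return {
        c: [1 if c in labels else 0 for labels in label_sets]
        for c in classes
    }


def merge_binary_predictions(binary_preds):
    """Recombine per-class 0/1 predictions into one label set per sample."""
    n_samples = len(next(iter(binary_preds.values())))
    return [
        {c for c, preds in binary_preds.items() if preds[i] == 1}
        for i in range(n_samples)
    ]


# Hypothetical annotations for three sequences.
label_sets = [{"nucleus"}, {"cytoplasm", "nucleus"}, {"exosome"}]
classes = ["nucleus", "cytoplasm", "exosome"]

targets = to_binary_problems(label_sets, classes)
# Any binary classifier could be trained on each targets[c]; here the
# targets themselves stand in for perfect predictions.
recovered = merge_binary_predictions(targets)
```

Because each binary model is trained independently, this decomposition ignores correlations between locations, which is the trade-off that label-correlation methods elsewhere in this list try to address.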
Subcellular Localization Prediction by Deep N-to-1 Convolutional Neural Networks

Maryam Gillani,Gianluca Pollastri

DOI: https://doi.org/10.3390/ijms25105440

IF: 5.6

2024-05-17

International Journal of Molecular Sciences

Abstract:The subcellular location of a protein provides valuable insights to bioinformaticians in terms of drug designs and discovery, genomics, and various other aspects of medical research. Experimental methods for protein subcellular localization determination are time-consuming and expensive, whereas computational methods, if accurate, would represent a much more efficient alternative. This article introduces an ab initio protein subcellular localization predictor based on an ensemble of Deep N-to-1 Convolutional Neural Networks. Our predictor is trained and tested on strict redundancy-reduced datasets and achieves 63% accuracy for the diverse number of classes. This predictor is a step towards bridging the gap between a protein sequence and the protein's function. It can potentially provide information about protein–protein interaction to facilitate drug design and processes like vaccine production that are essential to disease prevention.

biochemistry & molecular biology,chemistry, multidisciplinary
SIFLoc: a self-supervised pre-training method for enhancing the recognition of protein subcellular localization in immunofluorescence microscopic images

Yanlun Tu,Houchao Lei,Hong-Bin Shen,Yang Yang

DOI: https://doi.org/10.1093/bib/bbab605

IF: 9.5

2022-02-12

Briefings in Bioinformatics

Abstract:With the rapid growth of high-resolution microscopy imaging data, revealing the subcellular map of human proteins has become a central task in the spatial proteome. The cell atlas of the Human Protein Atlas (HPA) provides precious resources for recognizing subcellular localization patterns at the cell level, and the large-scale annotated data enable learning via advanced deep neural networks. However, the existing predictors still suffer from the imbalanced class distribution and the lack of labeled data for minor classes. Thus, it is necessary to develop new methods for coping with these issues. We leverage the self-supervised learning protocol to address these problems. Especially, we propose a pre-training scheme to enhance the conventional supervised learning framework called SIFLoc. The pre-training is featured by a hybrid data augmentation method and a modified contrastive loss function, aiming to learn good feature representations from microscopic images. The experiments are performed on a large-scale immunofluorescence microscopic image dataset collected from the HPA database. Using the same deep neural networks as the classifier, the model pre-trained via SIFLoc not only outperforms the model without pre-training by a large margin but also shows advantages over the state-of-the-art self-supervised learning methods. Especially, SIFLoc improves the prediction accuracy for minor organelles significantly.

biochemical research methods,mathematical & computational biology
Deep localization of protein structures in fluorescence microscopy images

Muhammad Tahir,Saeed Anwar,Ajmal Mian,Abdul Wahab Muzaffar

DOI: https://doi.org/10.48550/arXiv.1910.04287

2021-10-08

Abstract:Accurate localization of proteins from fluorescence microscopy images is challenging due to the inter-class similarities and intra-class disparities introducing grave concerns in addressing multi-class classification problems. Conventional machine learning-based image prediction pipelines rely heavily on pre-processing such as normalization and segmentation followed by hand-crafted feature extraction to identify useful, informative, and application-specific features. Here, we demonstrate that deep learning-based pipelines can effectively classify protein images from different datasets. We propose an end-to-end Protein Localization Convolutional Neural Network (PLCNN) that classifies protein images more accurately and reliably. PLCNN processes raw imagery without involving any pre-processing steps and produces outputs without any customization or parameter adjustment for a particular dataset. Experimental analysis is performed on five benchmark datasets. PLCNN consistently outperformed the existing state-of-the-art approaches from traditional machine learning and deep architectures. This study highlights the importance of deep learning for the analysis of fluorescence microscopy protein imagery. The proposed deep pipeline can better guide drug designing procedures in the pharmaceutical industry and open new avenues for researchers in computational biology and bioinformatics.

Computer Vision and Pattern Recognition,Machine Learning,Image and Video Processing
A Novel Method for Protein Subcellular Localization Based on Boosting and Probabilistic Neural Network

Jian Guo,Yuanlie Lin,Zhirong Sun

2004-01-01

Abstract:Subcellular localization is a key functional characteristic of proteins. An automatic, reliable and efficient prediction system for protein subcellular localization is needed for large-scale genome analysis. In this paper, we introduce a novel subcellular prediction method combining a boosting algorithm with a probabilistic neural network algorithm. This new approach provided superior prediction performance compared with existing methods. The total prediction accuracy on Reinhardt and Hubbard's dataset reached up to 92.8% for prokaryotic protein sequences and 81.4% for eukaryotic protein sequences under 5-fold cross validation. On our new dataset, the total accuracy achieved 83.2%. This novel method provides superior prediction performance compared with existing algorithms based on amino acid composition and can be a complementing method to other existing methods based on sorting signals.
Classification of X-Ray Protein Crystallization Using Deep Convolutional Neural Networks with a Finder Module

Yusei Miura,Tetsuya Sakurai,Claus Aranha,Toshiya Senda,Ryuichi Kato,Yusuke Yamada

DOI: https://doi.org/10.48550/arXiv.1812.10087

2018-12-25

Abstract:Recently, deep convolutional neural networks have shown good results for image recognition. In this paper, we use convolutional neural networks with a finder module, which discovers the important region for recognition and extracts that region. We propose applying our method to the recognition of protein crystals for X-ray structural analysis. In this analysis, it is necessary to recognize states of protein crystallization from a large number of images. There are several methods that realize protein crystallization recognition by using convolutional neural networks. In each method, large-scale data sets are required to recognize with high accuracy. In our data set, the number of images is not good enough for training CNN. The amount of data for CNN is a serious issue in various fields. Our method realizes high accuracy recognition with few images by discovering the region where the crystallization drop exists. We compared our crystallization image recognition method with a high precision method using Inception-V3. We demonstrate that our method is effective for crystallization images using several experiments. Our method gained the AUC value that is about 5% higher than the compared method.

Computer Vision and Pattern Recognition
Protein Subcellular Localization Based on PSI-BLAST and Machine Learning.

Jian Guo,Xian Pu,Yuanlie Lin,Howard Leung

DOI: https://doi.org/10.1142/s0219720006002405

2006-01-01

Journal of Bioinformatics and Computational Biology

Abstract:Subcellular location is an important functional annotation of proteins. An automatic, reliable and efficient prediction system for protein subcellular localization is necessary for large-scale genome analysis. This paper describes a protein subcellular localization method which extracts features from protein profiles rather than from amino acid sequences. The protein profile represents a protein family, discards part of the sequence information that is not conserved throughout the family and therefore is more sensitive than the amino acid sequence. The amino acid compositions of the whole profile and the N-terminus of the profile are extracted, respectively, to train and test the probabilistic neural network classifiers. On two benchmark datasets, the overall accuracies of the proposed method reach 89.1% and 68.9%, respectively. The prediction results show that the proposed method performs better than those methods based on amino acid sequences. The prediction results of the proposed method are also compared with Subloc on two redundancy-reduced datasets.
Locality Sensitive Deep Learning for Detection and Classification of Nuclei in Routine Colon Cancer Histology Images

Korsuk Sirinukunwattana,Shan E Ahmed Raza,Yee-Wah Tsang,David R. J. Snead,Ian A. Cree,Nasir M. Rajpoot

DOI: https://doi.org/10.1109/tmi.2016.2525803

IF: 10.6

2016-05-01

IEEE Transactions on Medical Imaging

Abstract:Detection and classification of cell nuclei in histopathology images of cancerous tissue stained with the standard hematoxylin and eosin stain is a challenging task due to cellular heterogeneity. Deep learning approaches have been shown to produce encouraging results on histopathology images in various studies. In this paper, we propose a Spatially Constrained Convolutional Neural Network (SC-CNN) to perform nucleus detection. SC-CNN regresses the likelihood of a pixel being the center of a nucleus, where high probability values are spatially constrained to locate in the vicinity of the centers of nuclei. For classification of nuclei, we propose a novel Neighboring Ensemble Predictor (NEP) coupled with CNN to more accurately predict the class label of detected cell nuclei. The proposed approaches for detection and classification do not require segmentation of nuclei. We have evaluated them on a large dataset of colorectal adenocarcinoma images, consisting of more than 20,000 annotated nuclei belonging to four different classes. Our results show that the joint detection and classification of the proposed SC-CNN and NEP produces the highest average F1 score as compared to other recently published approaches. Prospectively, the proposed methods could offer benefit to pathology practice in terms of quantitative analysis of tissue constituents in whole-slide images, and potentially lead to a better understanding of cancer.

engineering, biomedical,imaging science & photographic technology, electrical & electronic,computer science, interdisciplinary applications,radiology, nuclear medicine & medical imaging
SCLpred-EMS: subcellular localization prediction of endomembrane system and secretory pathway proteins by Deep N-to-1 Convolutional Neural Networks

Manaz Kaleel,Yandan Zheng,Jialiang Chen,Xuanming Feng,Jeremy C Simpson,Gianluca Pollastri,Catherine Mooney

DOI: https://doi.org/10.1093/bioinformatics/btaa156

IF: 5.8

2020-03-06

Bioinformatics

Abstract:Motivation: The subcellular location of a protein can provide useful information for protein function prediction and drug design. Experimentally determining the subcellular location of a protein is an expensive and time-consuming task. Therefore, various computer-based tools have been developed, mostly using machine learning algorithms, to predict the subcellular location of proteins. Results: Here, we present a neural network-based algorithm for protein subcellular location prediction. We introduce SCLpred-EMS, a subcellular localization predictor powered by an ensemble of Deep N-to-1 Convolutional Neural Networks. SCLpred-EMS predicts the subcellular location of a protein into two classes, the endomembrane system and secretory pathway versus all others, with a Matthews correlation coefficient of 0.75–0.86, outperforming the other state-of-the-art web servers we tested. Availability and implementation: SCLpred-EMS is freely available for academic users at http://distilldeep.ucd.ie/SCLpred2/. Contact: catherine.mooney@ucd.ie
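
For readers unfamiliar with the metric quoted above, the Matthews correlation coefficient for a two-class problem can be computed from confusion-matrix counts as in the sketch below. This is the standard textbook definition, not code from SCLpred-EMS; the function name and the convention of returning 0.0 when the denominator vanishes are assumptions made for illustration.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (an assumed convention for the
    undefined case, e.g. when one class is never predicted).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 45 true positives, 45 true negatives, 5 of each error.
mcc = matthews_corrcoef(tp=45, tn=45, fp=5, fn=5)
```

The value ranges from -1 (complete disagreement) through 0 (no better than chance) to 1 (perfect prediction), which makes it less sensitive to class imbalance than plain accuracy.
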

biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology

Backbone 1H, 13C and 15N resonance assignments of the 39 kDa staphylococcal hemoglobin receptor IsdH

Protein Subcellular Localization Prediction by Concatenation of Convolutional Blocks for Deep Features Extraction From Microscopic Images

Extracting Cellular Location of Human Proteins Using Deep Learning

Imbalanced multi-modal multi-label learning for subcellular localization prediction of human proteins with both single and multiple sites

A multiresolution approach to automated classification of protein subcellular location images

PScL-DDCFPred: an ensemble deep learning-based approach for characterizing multiclass subcellular localization of human proteins from bioimage data

An Artificial Intelligence-Based Stacked Ensemble Approach for Prediction of Protein Subcellular Localization in Confocal Microscopy Images

Deep learning model for protein multi-label subcellular localization and function prediction based on multi-task collaborative training

Self-Supervised Deep Learning Encodes High-Resolution Features of Protein Subcellular Localization

Single-cell Subcellular Protein Localisation Using Novel Ensembles of Diverse Deep Architectures

Multilabel Learning for Protein Subcellular Location Prediction

Learning Consistent Subcellular Landmarks to Quantify Changes in Multiplexed Protein Maps

Identify ncRNA Subcellular Localization via Graph Regularized k-Local Hyperplane Distance Nearest Neighbor Model on Multi-Kernel Learning

Subcellular Localization Prediction by Deep N-to-1 Convolutional Neural Networks

SIFLoc: a self-supervised pre-training method for enhancing the recognition of protein subcellular localization in immunofluorescence microscopic images

Deep localization of protein structures in fluorescence microscopy images

A Novel Method for Protein Subcellular Localization Based on Boosting and Probabilistic Neural Network

Classification of X-Ray Protein Crystallization Using Deep Convolutional Neural Networks with a Finder Module

Protein Subcellular Localization Based on PSI-BLAST and Machine Learning

Locality Sensitive Deep Learning for Detection and Classification of Nuclei in Routine Colon Cancer Histology Images

SCLpred-EMS: subcellular localization prediction of endomembrane system and secretory pathway proteins by Deep N-to-1 Convolutional Neural Networks