Abstract: In the present paper, a deep convolutional neural network (DCNN) is applied to multilocus protein subcellular localization because it is well suited to multi-class classification. This application faces two main problems. First, appropriate features that capture the correlation between multiple sites are hard to find. Second, the classifier structure is difficult to determine because it is strongly affected by the distribution of the data being classified. To solve these problems, a self-evoluting framework using DCNNs for multilocus protein subcellular localization is proposed. It has three characteristics that previous algorithms lack. First, it combines the ant colony algorithm with the DCNN to form a self-evoluting algorithm for multilocus protein subcellular localization. Second, it randomly groups subcellular sites using a limited random k-labelsets (RAkEL) multi-label classification method, which breaks a complex problem into smaller ones in a divide-and-conquer manner and yields a model that can be flexibly expanded. Third, it selects feature extraction methods at random during localization, which avoids the defects of any individual feature extraction method. The algorithm is tested on the human database, where its overall accuracy is 67.17%, higher than that of the stacked autoencoder (SAE), support vector machine (SVM), random forest (RF) classifier, or a single deep convolutional neural network.

Graphical abstract

The algorithm described in the present paper consists of four main parts: protein sequence data preprocessing, integrated DCNN model construction, finding the optimal DCNN combination by ant colony optimization, and protein subcellular localization for sequences. These parts run in sequence, and the data produced by each part is the basis for the next.
In the data preprocessing part, the limited RAkEL multi-label classification method is used to randomly group subcellular sites. At the same time, multiple feature extraction methods are applied to the protein sequences and the resulting features are fused. Each combination of features and site information corresponds to one DCNN model. In the ant colony optimization part, the aim is to find the best combination of DCNN models by using the global optimization ability of the ant colony algorithm. In the final part, the optimal model combination is used to obtain the multilocus subcellular localization of each sequence.
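
The random grouping of subcellular sites can be illustrated with a minimal sketch of plain random k-labelsets in Python. It is not the authors' implementation: the exact constraint behind the "limited" variant is not given here, so none is modelled, and the site names, the function names, and the parameters `k`, `m`, and `threshold` are assumptions made for illustration. Each group of `k` sites is treated as one multi-class problem (one class per subset of the group), which is the form a single DCNN would be trained on, and the per-group predictions are merged by majority vote.

```python
import random
from itertools import combinations


def random_k_labelsets(labels, k, m, seed=0):
    """Draw m distinct k-element groups of labels (hypothetical helper)."""
    all_sets = list(combinations(sorted(labels), k))
    if m > len(all_sets):
        raise ValueError("m exceeds the number of distinct k-labelsets")
    rng = random.Random(seed)
    return rng.sample(all_sets, m)  # sampling without replacement


def powerset_class(sample_labels, labelset):
    """Encode which sites of one group a protein occupies as a class index.

    Bit i of the index is set when labelset[i] is among the protein's sites,
    so a group of k sites yields 2**k classes for a multi-class classifier.
    """
    return sum(1 << i for i, lab in enumerate(labelset) if lab in sample_labels)


def decode_class(class_index, labelset):
    """Inverse of powerset_class: recover the predicted sites of one group."""
    return {lab for i, lab in enumerate(labelset) if (class_index >> i) & 1}


def vote(predictions, labels, threshold=0.5):
    """Merge per-group predictions; keep a site if its share of votes exceeds threshold.

    predictions is a list of (labelset, class_index) pairs, one per model.
    Sites that appear in no group receive no votes and are never predicted.
    """
    yes = {lab: 0 for lab in labels}
    seen = {lab: 0 for lab in labels}
    for labelset, class_index in predictions:
        present = decode_class(class_index, labelset)
        for lab in labelset:
            seen[lab] += 1
            yes[lab] += lab in present
    return {lab for lab in labels if seen[lab] and yes[lab] / seen[lab] > threshold}


if __name__ == "__main__":
    # Illustrative site names only; the paper's site list is not reproduced here.
    sites = ["cytoplasm", "membrane", "mitochondrion", "nucleus"]
    groups = random_k_labelsets(sites, k=2, m=3, seed=1)
    protein_sites = {"cytoplasm", "nucleus"}
    # Training target of this protein for each group's multi-class model.
    targets = [(g, powerset_class(protein_sites, g)) for g in groups]
    print(groups)
    print(vote(targets, sites))
```

In this sketch the choice of which groups and feature sets to keep is left open; in the framework described above, that selection is what the ant colony optimization step performs.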

Multi-labelled Proteins Recognition for High-Throughput Microscopy Images Using Deep Convolutional Neural Networks

Classifying Mixed Patterns of Proteins in High-Throughput Microscopy Images Using Deep Neural Networks

AMC-Net: Asymmetric and Multi-Scale Convolutional Neural Network for Multi-Label HPA Classification

Protein Subcellular Localization Prediction by Concatenation of Convolutional Blocks for Deep Features Extraction From Microscopic Images

A Multi-Scale Multi-Model Deep Neural Network Via Ensemble Strategy on High-Throughput Microscopy Image for Protein Subcellular Localization

An Artificial Intelligence-Based Stacked Ensemble Approach for Prediction of Protein Subcellular Localization in Confocal Microscopy Images

ProteoNet: A CNN-based Framework for Analyzing Proteomics MS-RGB Images

AnnoPRO: an Innovative Strategy for Protein Function Annotation Based on Image-like Protein Representation and Multimodal Deep Learning

Extracting Cellular Location of Human Proteins Using Deep Learning

Multi-scale Deep Learning for the Imbalanced Multi-Label Protein Subcellular Localization Prediction Based on Immunohistochemistry Images

Deep localization of protein structures in fluorescence microscopy images

DEEPGONET: Multi-label Prediction of GO Annotation for Protein from Sequence Using Cascaded Convolutional and Recurrent Network

Prediction of Protein Subcellular Localization Based on Microscopic Images via Multi-Task Multi-Instance Learning

ImPLoc: a Multi-Instance Deep Learning Model for the Prediction of Protein Subcellular Localization Based on Immunohistochemistry Images

Incorporating Label Correlations into Deep Neural Networks to Classify Protein Subcellular Location Patterns in Immunohistochemistry Images

Using Deep Convolutional Neural Networks to Circumvent Morphological Feature Specification when Classifying Subvisible Protein Aggregates from Micro-Flow Images

Self-evoluting framework of deep convolutional neural network for multilocus protein subcellular localization

Near perfect protein multi-label classification with deep neural networks

DeepIso: A Deep Learning Model for Peptide Feature Detection

Convolutional Neural Network-Based Artificial Intelligence for Classification of Protein Localization Patterns

Protein Remote Homology Detection Based on Deep Convolutional Neural Network