Abstract: Deep learning networks have achieved great success in many areas, such as large-scale image processing. However, they usually require substantial computing resources and time, and they process easy and hard samples in the same way, which is inefficient. Another drawback is that the network generally needs to be retrained to learn newly arriving data. Efforts have been made to reduce computing cost and realize incremental learning by adjusting architectures, such as scalable effort classifiers, multi-grained cascade forest (gcForest), conditional deep learning (CDL), tree CNN, decision tree structure with knowledge transfer (ERDK), and forest of decision trees with radial basis function (RBF) networks and knowledge transfer (FDRK). In this article, a parallel multistage wide neural network (PMWNN) is presented. It is composed of multiple stages that classify different parts of the data. First, a wide radial basis function (WRBF) network is designed to learn features efficiently in the wide direction. It works on both vector and image instances and can be trained in one epoch using subsampling and least squares (LS). Second, successive stages of WRBF networks are combined to make up the PMWNN. Each stage focuses on the samples misclassified by the previous stage. The PMWNN can stop growing at an early stage, and a stage can be added incrementally when new training data are acquired. Finally, the stages of the PMWNN can be tested in parallel, thus speeding up the testing process. To sum up, the proposed PMWNN has the advantages of: 1) optimized computing resources; 2) incremental learning; and 3) parallel testing with stages.
The experimental results on the MNIST data, several large hyperspectral remote sensing datasets, and different types of data from different application areas, including many image and nonimage datasets, show that the WRBF and PMWNN work well on both image and nonimage data and achieve very competitive accuracy compared to learning models such as stacked autoencoders, deep belief nets, support vector machine (SVM), multilayer perceptron (MLP), LeNet-5, the RBF network, and the recently proposed CDL, broad learning, gcForest, ERDK, and FDRK.
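
The two ideas named in the abstract, an RBF layer whose output weights are fit by least squares in a single pass, and successive stages trained on the samples the previous stage misclassified, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The Gaussian kernel, the choice of centers by random subsampling of the training data, the `gamma` value, the rule of summing stage scores at test time, and all function names (`rbf_features`, `fit_stage`, `fit_multistage`, `predict`) are assumptions made only for illustration.

```python
import numpy as np

def rbf_features(X, centers, gamma):
    # Gaussian RBF activations from squared Euclidean distances
    # between every sample and every center (assumed kernel form).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def fit_stage(X, y, n_classes, n_centers, gamma, rng):
    # Assumption: centers are a random subsample of the training points.
    idx = rng.choice(len(X), size=min(n_centers, len(X)), replace=False)
    centers = X[idx]
    H = rbf_features(X, centers, gamma)
    T = np.eye(n_classes)[y]  # one-hot targets
    # Output weights from a single least-squares solve (no iterative epochs).
    W, *_ = np.linalg.lstsq(H, T, rcond=None)
    return centers, W

def stage_scores(stage, X, gamma):
    centers, W = stage
    return rbf_features(X, centers, gamma) @ W

def fit_multistage(X, y, n_classes, n_stages=3, n_centers=20, gamma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    stages = []
    Xs, ys = X, y
    for _ in range(n_stages):
        stage = fit_stage(Xs, ys, n_classes, n_centers, gamma, rng)
        stages.append(stage)
        wrong = stage_scores(stage, Xs, gamma).argmax(axis=1) != ys
        if not wrong.any():
            break  # stop growing early: nothing left to correct
        # The next stage sees only the samples this stage got wrong.
        Xs, ys = Xs[wrong], ys[wrong]
    return stages

def predict(stages, X, gamma=1.0):
    # Each stage's scores depend only on X, so the stages could be
    # evaluated in parallel; they are summed here (assumed combination rule).
    total = sum(stage_scores(s, X, gamma) for s in stages)
    return total.argmax(axis=1)
```

Under these assumptions, adding a stage for newly acquired data amounts to calling `fit_stage` on the new samples and appending the result to `stages`, without refitting earlier stages.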

Can Infinitely Wide Deep Nets Help Small-data Multi-label Learning?

Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks

Markov-Lipschitz Deep Learning

Two-Stage Label Embedding Via Neural Factorization Machine for Multi-Label Classification

DWS-MKL: Depth-width-scaling multiple kernel learning for data classification

Infinite Width Limits of Self Supervised Neural Networks

Deep Multiple Instance Learning for Zero-Shot Image Tagging

Feature Learning in Infinite-Width Neural Networks

Deep Dependency Networks and Advanced Inference Schemes for Multi-Label Classification

Wide Neural Networks as Gaussian Processes: Lessons from Deep Equilibrium Models

Deep Learning for Extreme Multi-label Text Classification

Efficient NTK using Dimensionality Reduction

On Exact Computation with an Infinitely Wide Neural Net

Towards a General Theory of Infinite-Width Limits of Neural Classifiers

Large Margin Deep Neural Networks: Theory and Algorithms.

Parallel Multistage Wide Neural Network

Towards Understanding Deep Learning from Noisy Labels with Small-Loss Criterion

Generalization Ability of Wide Neural Networks on $\mathbb{R}$

A General Multiple Data Augmentation Based Framework for Training Deep Neural Networks

A Unified Kernel for Neural Network Learning

L_DMI: A Novel Information-theoretic Loss Function for Training Deep Nets Robust to Label Noise