Abstract: Neural networks, as powerful tools for data mining and knowledge engineering, can learn from data to build feature-based classifiers and nonlinear predictive models. Training neural networks involves the optimization of nonconvex objective functions, and the learning process is usually costly and often infeasible for applications involving data streams. A possible, albeit counterintuitive, alternative is to randomly assign a subset of the network's weights so that the resulting optimization task can be formulated as a linear least-squares problem. This methodology can be applied to both feedforward and recurrent networks, and similar techniques can be used to approximate kernel functions. Many experimental results indicate that such randomized models can reach sound performance compared to fully adaptable ones, with a number of benefits, including (1) simplicity of implementation, (2) faster learning with less human intervention, and (3) the possibility of leveraging linear regression and classification algorithms (e.g., ℓ1-norm minimization for obtaining sparse formulations). This class of neural networks is attractive and valuable to the data mining community, particularly for handling large-scale data mining in real time. However, the literature in the field is extremely vast and fragmented, with many results being reintroduced multiple times under different names. This overview aims to provide a self-contained, uniform introduction to the different ways in which randomization can be applied to the design of neural networks and kernel functions. A clear exposition of the basic framework underlying all these approaches helps to clarify innovative lines of research and open problems and, most importantly, to foster the exchange of well-known results across different communities.

WIREs Data Mining Knowl Discov 2017, 7:e1200. doi: 10.1002/widm.1200

This article is categorized under: Technologies > Machine Learning
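The core idea in the abstract, fixing a randomly drawn subset of weights so that the remaining fit is a linear least-squares problem, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`fit_random_feature_network`, `predict`), the tanh activation, the weight distributions, the ridge penalty, and the toy sine target are all assumptions chosen for illustration.

```python
import numpy as np


def fit_random_feature_network(X, y, n_hidden=200, ridge=1e-3, seed=0):
    """Fit a single-hidden-layer network whose hidden weights stay random.

    Hypothetical helper: only the output weights `beta` are learned,
    by solving a ridge-regularised linear least-squares problem.
    """
    rng = np.random.default_rng(seed)
    # Hidden weights and biases are drawn once and never trained (assumed distributions).
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = np.tanh(X @ W + b)  # random nonlinear features, shape (n_samples, n_hidden)
    # Normal equations of the ridge problem: (H^T H + ridge * I) beta = H^T y
    A = H.T @ H + ridge * np.eye(n_hidden)
    beta = np.linalg.solve(A, H.T @ y)
    return W, b, beta


def predict(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned linear readout."""
    return np.tanh(X @ W + b) @ beta


# Toy regression task (assumed for illustration): learn sin(x) on [-3, 3].
data_rng = np.random.default_rng(1)
X_train = data_rng.uniform(-3.0, 3.0, size=(500, 1))
y_train = np.sin(X_train[:, 0])

W, b, beta = fit_random_feature_network(X_train, y_train)

X_test = np.linspace(-3.0, 3.0, 50).reshape(-1, 1)
y_pred = predict(X_test, W, b, beta)
mse = float(np.mean((y_pred - np.sin(X_test[:, 0])) ** 2))
print(f"test MSE: {mse:.2e}")
```

Because only `beta` is optimized, training reduces to one linear solve, which is where the simplicity and speed mentioned in the abstract come from. Replacing the ridge penalty with an ℓ1 penalty would give the sparse variant the abstract alludes to, at the cost of needing an iterative solver.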

Learning from Randomly Initialized Neural Network Features

Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning

Randomness in Neural Networks: an Overview

Critical feature learning in deep neural networks

Local Kernel Renormalization as a mechanism for feature learning in overparametrized Convolutional Neural Networks

How to Initialize your Network? Robust Initialization for WeightNorm & ResNets

Random Projection in Deep Neural Networks

From Activation to Initialization: Scaling Insights for Optimizing Neural Fields

Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable

How Spurious Features Are Memorized: Precise Analysis for Random and NTK Features

Nonuniform random feature models using derivative information

A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities

Batch Normalization Provably Avoids Rank Collapse for Randomly Initialised Deep Networks

The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks

Feature Learning and Generalization in Deep Networks with Orthogonal Weights

Dynamics of finite width Kernel and prediction fluctuations in mean field neural networks

On Random Kernels of Residual Architectures

Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks

The Surprising Power of Graph Neural Networks with Random Node Initialization

Low-dimensional Intrinsic Dimension Reveals a Phase Transition in Gradient-Based Learning of Deep Neural Networks

Deep Neural Networks Tend To Extrapolate Predictably