Adaptive Activation Functions for Predictive Modeling with Sparse Experimental Data

Farhad Pourkamali-Anaraki, Tahamina Nasrin, Robert E. Jensen, Amy M. Peterson, Christopher J. Hansen

2024-02-08

Abstract: A pivotal aspect in the design of neural networks lies in selecting activation functions, crucial for introducing nonlinear structures that capture intricate input-output patterns. While the effectiveness of adaptive or trainable activation functions has been studied in domains with ample data, like image classification problems, significant gaps persist in understanding their influence on classification accuracy and predictive uncertainty in settings characterized by limited data availability. This research aims to address these gaps by investigating the use of two types of adaptive activation functions. These functions incorporate shared and individual trainable parameters per hidden layer and are examined in three testbeds derived from additive manufacturing problems containing fewer than one hundred training instances. Our investigation reveals that adaptive activation functions, such as Exponential Linear Unit (ELU) and Softplus, with individual trainable parameters, result in accurate and confident prediction models that outperform fixed-shape activation functions and the less flexible method of using identical trainable activation functions in a hidden layer. Therefore, this work presents an elegant way of facilitating the design of adaptive neural networks in scientific and engineering problems.

Neural and Evolutionary Computing, Machine Learning

What problem does this paper attempt to address?

The paper examines how well adaptive activation functions work in neural networks when experimental data are sparse (i.e., small sample sizes). Specifically, the research focuses on the following aspects:

1. **Evaluating the effectiveness of adaptive activation functions**: Whether adaptive activation functions (with trainable parameters) improve prediction accuracy and model confidence relative to traditional fixed-shape activation functions (such as ELU, Softplus, and Swish). The study compares sharing one trainable activation function across a hidden layer with assigning independent trainable parameters to each unit.
2. **Effectiveness on small sample datasets**: Systematically exploring, for the first time, the performance of adaptive activation functions in application scenarios with fewer than 100 training samples. Three additive manufacturing problems serve as case studies for assessing how well adaptive activation functions perform in such data-scarce settings.
3. **Quantifying prediction uncertainty**: In addition to traditional classification accuracy, the study uses prediction sets constructed with conformal inference and evaluates the impact of adaptive activation functions on predictive uncertainty through two metrics: empirical coverage and the average size of the prediction sets.
4. **Providing code implementation**: To help practitioners apply adaptive activation functions under limited data conditions, the authors provide source code for hidden layer activation functions with shared and independent trainable parameters.
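The difference between shared and individual trainable parameters can be illustrated with a minimal sketch. It is not the authors' released code: the class name `AdaptiveActivation`, the initial value of 1.0, and the specific parametric forms of ELU and Softplus (the commonly used ones) are assumptions made for illustration, and only the forward pass is shown. In a real model the parameters would be updated by the training loop of a deep learning framework.

```python
import math


def elu(x, alpha):
    """ELU with shape parameter alpha: x if x > 0, else alpha * (exp(x) - 1)."""
    return x if x > 0 else alpha * math.expm1(x)


def softplus(x, beta):
    """Softplus with sharpness beta: log(1 + exp(beta * x)) / beta."""
    z = beta * x
    # Numerically stable evaluation of log(1 + exp(z)).
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / beta


class AdaptiveActivation:
    """Hypothetical helper applying one activation to a hidden layer's outputs.

    mode="shared":     a single trainable parameter for the whole layer.
    mode="individual": one trainable parameter per hidden unit.
    """

    def __init__(self, fn, num_units, mode="shared", init=1.0):
        if mode not in ("shared", "individual"):
            raise ValueError("mode must be 'shared' or 'individual'")
        self.fn = fn
        self.num_units = num_units
        self.mode = mode
        # These values are what a training procedure would adjust.
        self.params = [init] if mode == "shared" else [init] * num_units

    def param_for(self, unit):
        """Return the parameter that governs the given hidden unit."""
        return self.params[0] if self.mode == "shared" else self.params[unit]

    def __call__(self, pre_activations):
        if len(pre_activations) != self.num_units:
            raise ValueError("expected one pre-activation per hidden unit")
        return [self.fn(x, self.param_for(j))
                for j, x in enumerate(pre_activations)]


if __name__ == "__main__":
    z = [-1.0, 0.5, -2.0]  # made-up pre-activations of a 3-unit layer
    shared = AdaptiveActivation(elu, num_units=3, mode="shared")
    individual = AdaptiveActivation(elu, num_units=3, mode="individual")
    individual.params = [0.5, 1.0, 2.0]  # illustrative values, not learned
    print(shared(z))
    print(individual(z))
```

With a shared parameter, the layer adds one extra trainable value; with individual parameters, it adds one per hidden unit, so each unit can take a different activation shape.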
Through this research, the paper aims to close the gap in understanding how effective adaptive activation functions are in small-sample settings, and to provide new insights and tools for researchers in science and engineering who are constrained by limited labeled data.
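The two uncertainty metrics mentioned above, empirical coverage and average prediction set size, can be sketched with split conformal prediction. This is a minimal illustration under stated assumptions: the nonconformity score (one minus the predicted probability of the true class) is a common choice and is not confirmed to be the one used in the paper, and the function names are hypothetical.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Score threshold from a held-out calibration set.

    Assumed score: 1 - predicted probability of the true class.
    """
    n = len(cal_labels)
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    # Finite-sample corrected rank of the (1 - alpha) quantile.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: keep every class.
        return float("inf")
    return scores[rank - 1]


def prediction_set(probs, qhat):
    """All classes whose score 1 - p does not exceed the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= qhat]


def coverage_and_size(test_probs, test_labels, qhat):
    """Empirical coverage and average prediction set size on test data."""
    sets = [prediction_set(p, qhat) for p in test_probs]
    coverage = sum(y in s for s, y in zip(sets, test_labels)) / len(sets)
    avg_size = sum(len(s) for s in sets) / len(sets)
    return coverage, avg_size


if __name__ == "__main__":
    # Made-up class probabilities from some classifier, for illustration.
    cal_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
    cal_labels = [0, 1, 0, 1]
    qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
    test_probs = [[0.7, 0.3], [0.2, 0.8], [0.1, 0.9]]
    test_labels = [0, 1, 0]
    print(coverage_and_size(test_probs, test_labels, qhat))
```

A model is more useful when its coverage stays near the target level 1 - alpha while its prediction sets remain small: high coverage with large sets indicates an uncertain model.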
