Abstract: Humans sometimes have an insight that leads to a sudden and drastic performance improvement on the task they are working on. Such sudden strategy adaptations are often linked to insights, which are considered a unique aspect of human cognition tied to complex processes such as creativity or meta-cognitive reasoning. Here, we take a learning perspective and ask whether insight-like behaviour can occur in simple artificial neural networks, even when the models only learn to form input-output associations through gradual gradient descent. We compared learning dynamics in humans and regularised neural networks in a perceptual decision task that included a hidden regularity that allowed the task to be solved more efficiently. Our results show that only some humans discovered this regularity, and that their behaviour was marked by an abrupt strategy switch that reflects an aha-moment. Notably, we find that simple neural networks with a gradual learning rule and a constant learning rate closely mimicked the behavioural characteristics of human insight-like switches, exhibiting delay of insight, suddenness and selective occurrence in only some networks. Analyses of network architectures and learning dynamics revealed that insight-like behaviour crucially depended on a regularised gating mechanism and noise added to gradient updates, which allowed the networks to accumulate "silent knowledge" that is initially suppressed by regularised gating. This suggests that insight-like behaviour can arise from gradual learning in simple neural networks, where it reflects the combined influences of noise, gating and regularisation. These results have potential implications for more complex systems, such as the brain, and guide the way for future insight research.
Insights, or aha-moments, are a remarkable phenomenon in human cognition that is unique in a number of ways: they are accompanied by a powerful subjective experience, occur abruptly after an unpredictable period of having been stuck on a problem, and for some never arise. But are insights harbingers of a unique mode of learning that only appears in the highest of animals? We show that insight-like behaviours can occur even in the simplest neural networks trained with regular gradient descent techniques. Human and machine behaviour was compared on the same decision-making task, which included a hidden regularity. Neural networks with L1-regularised gate modulation closely mimic the key behavioural characteristics that we identified in humans. An analysis of the insight networks showed that noise and regularisation played an important part in bringing about insight-like behaviour, and that this behaviour was preceded by "silent knowledge" that is initially suppressed by gating. Our results shed new light on the computational origins of insights and suggest that they can arise from gradual learning mechanisms.
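The three ingredients named above (gating, L1 regularisation of the gates, and noise added to gradient updates) can be illustrated with a minimal sketch. This is not the authors' model: the gated linear unit, the squared-error loss, the function name `noisy_gated_step` and all parameter names and default values are assumptions made for illustration only.

```python
import numpy as np

def noisy_gated_step(w, g, x, target, lr=0.1, l1=0.01, noise_sd=0.0, rng=None):
    """One noisy gradient step on a gated linear unit (illustrative assumption).

    Output: y = sum_i g_i * w_i * x_i
    Loss:   0.5 * (y - target)**2 + l1 * sum_i |g_i|
    """
    rng = np.random.default_rng() if rng is None else rng
    err = np.sum(g * w * x) - target
    # Gradients of the squared-error term with respect to weights and gates.
    grad_w = err * g * x
    # The L1 penalty pushes every gate towards zero at a constant rate.
    grad_g = err * w * x + l1 * np.sign(g)
    # Gaussian noise is added to the gradients before the update.
    grad_w = grad_w + rng.normal(0.0, noise_sd, size=w.shape)
    grad_g = grad_g + rng.normal(0.0, noise_sd, size=g.shape)
    return w - lr * grad_w, g - lr * grad_g

# Hypothetical two-input example with noise switched off.
w0 = np.array([1.0, 0.5])
g0 = np.array([1.0, 0.2])
x0 = np.array([1.0, 1.0])
w1, g1 = noisy_gated_step(w0, g0, x0, target=1.0)
```

In this sketch a gate held near zero by the penalty hides whatever its weight has learned. That is one way to picture "silent knowledge" that only shows up in behaviour once the gate opens.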

Early learning of the optimal constant solution in neural networks and humans

From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks

The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks

Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning

Attentional Bias in Human Category Learning: The Case of Deep Learning

Abrupt and spontaneous strategy switches emerge in simple regularised neural networks

Critical Learning Periods Emerge Even in Deep Linear Networks

Towards learning-to-learn

Deep Predictive Learning in Neocortex and Pulvinar

The large learning rate phase of deep learning: the catapult mechanism

Learning Inductive Biases with Simple Neural Networks

Early alignment in two-layer networks training is a two-edged sword

Deconstructing the Goldilocks Zone of Neural Network Initialization

Critical Learning Periods in Deep Neural Networks

Understanding Dynamics of Nonlinear Representation Learning and Its Application

A simple normative network approximates local non-Hebbian learning in the cortex

On the Complexity of Learning Neural Networks

Relational Constraints On Neural Networks Reproduce Human Biases towards Abstract Geometric Regularity

Towards the One Learning Algorithm Hypothesis: A System-theoretic Approach

Evaluating (and Improving) the Correspondence Between Deep Neural Networks and Human Representations