Strategies for Conceptual Change in Convolutional Neural Networks

Maarten Grachten, Carlos Eduardo Cancino Chacón
DOI: https://doi.org/10.48550/arXiv.1711.01634
2019-06-25
Abstract: A remarkable feature of human beings is their capacity for creative behaviour, referring to their ability to react to problems in ways that are novel, surprising, and useful. Transformational creativity is a form of creativity where the creative behaviour is induced by a transformation of the actor's conceptual space, that is, the representational system with which the actor interprets its environment. In this report, we focus on ways of adapting systems of learned representations as they switch from performing one task to performing another. We describe an experimental comparison of multiple strategies for adaptation of learned features, and evaluate how effectively each of these strategies realizes the adaptation, in terms of the amount of training, and in terms of their ability to cope with restricted availability of training data. We show, among other things, that across handwritten digits, natural images, and classical music, adaptive strategies are systematically more effective than a baseline method that starts learning from scratch.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper asks how a Convolutional Neural Network (CNN) can adapt effectively when it switches from one task to another. Specifically, the authors study how learned representations can be adjusted to cope with a change of task or environment, so that retraining is more efficient and the resulting performance is better.

### Problem Description

1. **Adaptive Representation and Creative Behavior**:
   - The authors explore how creative behavior can be achieved by changing the representations learned by a Convolutional Neural Network. Creative behavior refers to an actor's ability to deal with problems in a novel, surprising, and useful way. It can be induced by transforming the actor's conceptual space, i.e., the representational system with which the actor interprets its environment.
2. **Concept-Shift Strategies**:
   - The paper focuses on how to adjust learned representations when the model switches from one task to another. It describes multiple strategies for adapting learned features and evaluates each one by the amount of training it requires and by how well it copes with limited training data.
3. **Experimental Design**:
   - The authors compare the adaptation strategies experimentally on three kinds of data: handwritten digits, natural images, and classical music. The results show that the adaptive strategies are systematically more effective than a baseline that starts learning from scratch.
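The difference between the from-scratch baseline and an adaptive strategy can be illustrated at the level of parameter initialisation. The sketch below is only an assumption about what such a strategy might look like: this summary does not spell out the paper's actual strategies, so the layer names, shapes, the helper functions `init_from_scratch` and `adapt_from`, and the choice to re-initialise only the output layer are all hypothetical.

```python
import numpy as np


def init_from_scratch(shapes, rng):
    """Baseline: draw every parameter tensor from a fresh random distribution."""
    return {name: rng.normal(scale=0.1, size=shape) for name, shape in shapes.items()}


def adapt_from(pretrained, reinit, rng):
    """Hypothetical adaptive start: copy the old task's parameters and
    re-initialise only the layers listed in `reinit`."""
    adapted = {name: w.copy() for name, w in pretrained.items()}
    for name in reinit:
        adapted[name] = rng.normal(scale=0.1, size=pretrained[name].shape)
    return adapted


# Illustrative layer shapes (assumed, not taken from the paper).
shapes = {"conv1": (8, 1, 3, 3), "conv2": (16, 8, 3, 3), "out": (10, 64)}
rng = np.random.default_rng(0)

# Stand-in for parameters that were already trained on the old task.
old_task = init_from_scratch(shapes, rng)

# Baseline: ignore the old task and start again from random parameters.
baseline = init_from_scratch(shapes, rng)

# Adaptive start: keep the learned convolutional features, reset the output layer.
adapted = adapt_from(old_task, reinit=["out"], rng=rng)
```

Training on the new task would then proceed from either `baseline` or `adapted`; comparing how much training and data each starting point needs is the kind of evaluation the summary describes.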
### Mathematical Formula Representation

- **Convolution Layer Output**:
  \[ y_i^{(l)} = f_l\left(\sum_{j = 1}^{m_1^{(l - 1)}} W_{i,j}^{(l)} * y_j^{(l - 1)} + B_i^{(l)}\right) \]
  where \( * \) denotes the convolution operation, \( m_1^{(l - 1)} \) is the number of feature maps in layer \( l - 1 \), \( W_{i,j}^{(l)} \) is the convolution kernel connecting the \( j \)-th feature map of layer \( l - 1 \) to the \( i \)-th feature map of layer \( l \), \( B_i^{(l)} \) is the bias matrix, and \( f_l \) is the element-wise nonlinear activation function.
- **Fully Connected Layer Output**:
  \[ y^{(l)} = f_l\left(W^{(l)} y^{(l - 1)} + b^{(l)}\right) \]
  where \( W^{(l)} \) is the weight matrix connecting layer \( l - 1 \) to layer \( l \), \( b^{(l)} \) is the bias vector, and \( f_l \) is the element-wise nonlinear activation function.
- **Loss Function Minimization**:
  \[ \hat{\theta} = \arg\min_{\theta} L(\theta) \]
  where \( \theta \) denotes the network parameters and \( L(\theta) \) is the loss function. For multi-class classification problems, a commonly used loss is the average categorical cross-entropy.

### Summary

This paper explores and evaluates strategies for adapting Convolutional Neural Networks when the task changes, with the aim of making task switching more efficient and improving performance. In the reported experiments, the adaptive strategies outperform the baseline of training from scratch on multiple datasets.
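The layer and loss equations under "Mathematical Formula Representation" can be made concrete with a small NumPy sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the ReLU activation, the "valid" border handling, the softmax output layer, and all shapes and function names are choices made here for the example.

```python
import numpy as np


def relu(x):
    """Element-wise nonlinearity f_l (assumed; the summary does not name one)."""
    return np.maximum(x, 0.0)


def conv2d_valid(x, k):
    """True 2-D convolution (kernel flipped) without padding ('valid' mode)."""
    kh, kw = k.shape
    k_flipped = k[::-1, ::-1]
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(x[r:r + kh, c:c + kw] * k_flipped)
    return out


def conv_layer(y_prev, W, B, f=relu):
    """y_i = f(sum_j W[i, j] * y_prev[j] + B[i]) for every output feature map i.

    y_prev: (m_prev, H, W), W: (m_out, m_prev, kh, kw), B: (m_out, H', W').
    """
    m_out, m_prev = W.shape[:2]
    maps = []
    for i in range(m_out):
        pre = sum(conv2d_valid(y_prev[j], W[i, j]) for j in range(m_prev)) + B[i]
        maps.append(f(pre))
    return np.stack(maps)


def dense_layer(y_prev, W, b, f=relu):
    """y = f(W y_prev + b) with weight matrix W and bias vector b."""
    return f(W @ y_prev + b)


def softmax(z):
    """Output activation for classification (not element-wise; assumed here)."""
    e = np.exp(z - z.max())
    return e / e.sum()


def mean_cross_entropy(probs, targets):
    """Average categorical cross-entropy L over a batch of predictions.

    probs: (n, classes) predicted probabilities, targets: (n,) class indices.
    """
    return -np.mean(np.log(probs[np.arange(len(targets)), targets]))


# Forward pass on random data with illustrative shapes.
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 6, 6))            # one 6x6 input feature map
W1 = rng.normal(size=(2, 1, 3, 3))        # two 3x3 kernels
B1 = np.zeros((2, 4, 4))                  # one bias matrix per output map
h = conv_layer(x, W1, B1)                 # shape (2, 4, 4)

W2 = rng.normal(size=(3, h.size))         # 3-class output layer
b2 = np.zeros(3)
p = dense_layer(h.ravel(), W2, b2, f=softmax)
loss = mean_cross_entropy(p[None, :], np.array([1]))
```

Minimising `loss` with respect to `W1`, `B1`, `W2`, and `b2` corresponds to the \( \arg\min_{\theta} L(\theta) \) step; the explicit loops favour readability over speed.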