Initialization of CNN Models for Training on a Small Dataset Using Importance of Filter Parameters

Jien Kato, Guanwen Zhang, Yu Wang
DOI: https://doi.org/10.1527/tjsai.c-g42
2017-01-01
Transactions of the Japanese Society for Artificial Intelligence
Abstract: Deep Convolutional Neural Networks (CNNs) have achieved great success in many computer vision tasks. However, they remain difficult to use in practical tasks, especially small-scale ones, because training them requires a large quantity of labeled data. In this paper, we present two approaches that enable easy adaptation of CNNs to small-scale tasks: the Minimum Entropy Loss (MEL) approach and the Minimum Reconstruction Error (MRE) approach. The basic idea of both approaches is to select informative filters from pre-trained CNN models and reuse them to initialize CNNs designed for small-scale tasks. Unlike the popular fine-tuning approach, which also reuses pre-trained CNNs but does so by training them further without changing their model architectures, MEL and MRE make it easy to use pre-trained models in novel model architectures. This provides high flexibility when dealing with small-scale tasks. We evaluated the two approaches on practical small-scale tasks and confirmed their high performance and high flexibility.