A Preliminary Study on Data Augmentation of Deep Learning for Image Classification

Cheng Lei, Benlin Hu, Dong Wang, Shu Zhang, Zhenyu Chen
DOI: https://doi.org/10.1145/3361242.3361259
2019-01-01
Abstract: Deep learning models have a large number of free parameters, which must be learned by training on a large amount of data if the models are to generalize well. However, obtaining and labeling data is expensive in practice. Data augmentation is one way to alleviate this problem. In this paper, we conduct a preliminary study of how four variables (augmentation method, augmentation rate, size of the basic dataset per label, and method combination) affect the accuracy of deep learning for image classification. The study provides some guidelines: (1) methods that alter the geometry of the images are not always better than those that alter only lighting and color; (2) an augmentation rate of 2-3 times is good enough for training; (3) combining two geometric methods degrades performance, while combinations that include at least one photometric method improve performance, especially when one method is photometric and the other is geometric; (4) the order of the methods in a combination has little effect on performance.
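To make the variables named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It combines one geometric method (a horizontal flip) with one photometric method (brightness scaling) and grows a small labeled dataset by a chosen augmentation rate. The function names, the nested-list image representation, the brightness range of 0.7 to 1.3, and the reading of "augmentation rate" as the ratio of augmented to basic dataset size are all assumptions made for illustration.

```python
import random


def flip_horizontal(img):
    """Geometric method (illustrative): mirror each row of a grayscale image."""
    return [row[::-1] for row in img]


def adjust_brightness(img, factor):
    """Photometric method (illustrative): scale intensities, clamped to 0..255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


def augment(dataset, rate, rng):
    """Return a dataset `rate` times the size of `dataset`.

    `dataset` is a list of (image, label) pairs. Treating `rate` as the
    size ratio is an assumption, not a definition taken from the paper.
    """
    augmented = list(dataset)  # keep the original samples unchanged
    for _ in range(rate - 1):
        for img, label in dataset:
            # Assumed brightness range; the paper's settings are not given here.
            factor = rng.uniform(0.7, 1.3)
            # Combination: geometric method first, then photometric method.
            new_img = adjust_brightness(flip_horizontal(img), factor)
            augmented.append((new_img, label))
    return augmented


# Two tiny 2x2 "images" with hypothetical labels.
base = [([[0, 50], [100, 200]], "cat"), ([[10, 20], [30, 40]], "dog")]
result = augment(base, rate=3, rng=random.Random(0))
print(len(result))
```

With `rate=3`, each original sample gains two flipped, brightness-shifted copies that keep its label. Swapping the two calls inside `augment` would apply the methods in the opposite order, which is the kind of sequence change that guideline (4) reports as having little effect.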