Learning Curves for Analysis of Deep Networks

Derek Hoiem, Tanmay Gupta, Zhizhong Li, Michal M. Shlapentokh-Rothman
DOI: https://doi.org/10.48550/arXiv.2010.11029
2021-04-06
Abstract: Learning curves model a classifier's test error as a function of the number of training samples. Prior works show that learning curves can be used to select model parameters and extrapolate performance. We investigate how to use learning curves to evaluate design choices, such as pretraining, architecture, and data augmentation. We propose a method to robustly estimate learning curves, abstract their parameters into error and data-reliance, and evaluate the effectiveness of different parameterizations. Our experiments exemplify use of learning curves for analysis and yield several interesting observations.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem this paper addresses is how to use learning curves to evaluate and compare design choices for deep networks, such as pretraining, architecture, and data augmentation, and how to measure systematically the way classifier performance changes with the number of training samples. Specifically, the authors point to an obstacle in current machine learning research: there is no established method for measuring and reporting how performance changes with the amount of training data. This particularly affects areas such as representation learning, data augmentation, and few-shot learning, which are chiefly concerned with model performance under limited data. To improve this situation, the authors propose and demonstrate a method for evaluating classifier performance using learning curves.

### Main problem summary:

1. **Lack of systematic evaluation methods**:
   - Most current evaluations use a fixed training/testing split and so cannot show how a model performs with different amounts of data.
2. **Need for better performance measurement standards**:
   - Designing better classifiers, especially when data availability varies, requires more accurate ways of measuring learning ability.
3. **Explore the impact of design choices**:
   - How can learning curves be used to evaluate the impact of design choices such as pretraining, network architecture, optimization method, depth, width, fine-tuning, and data augmentation on model performance?

### Solutions:

- **Introduce learning curves**: The learning curve models the test error as a function of the number of training samples, in the form \( e_{\text{test}}(n)=\alpha+\eta n^{\gamma} \), where \( n \) is the number of training samples and \( \alpha \), \( \eta \), and \( \gamma \) are the curve's parameters.
- **Parameter abstraction**: For ease of comparison, the authors propose two key parameters, \( e_N \) and \( \beta_N \), representing the test error and data-reliance at \( n = N \), respectively.
- **Experimental verification**: A series of experiments verifies the effectiveness and stability of the learning curve under different design choices and provides new insights into those choices.

### Key contributions:

- **Modeling and estimating learning curves**: A robust method for estimating learning curves is proposed, with the curve parameters abstracted into error and data-reliance.
- **Analyze the impact of design choices**: Learning-curve analysis demonstrates the specific impact of different design choices (such as network architecture, optimization method, pretraining, and data augmentation) on model performance.

In summary, this paper aims to provide a systematic method for evaluating and comparing design choices of deep networks by introducing learning curves, so as to better understand a model's performance under different amounts of data.
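
To make the curve form \( e_{\text{test}}(n)=\alpha+\eta n^{\gamma} \) and the summary parameters \( e_N \) and \( \beta_N \) concrete, here is a minimal Python sketch. It is not the authors' robust estimation method: it uses a plain grid search over \( \gamma \) with ordinary least squares for \( \alpha \) and \( \eta \). The helper names `fit_learning_curve` and `summarize_at` are hypothetical, the data are synthetic, and the definition \( \beta_N = N \cdot \frac{de}{dn}\big|_{n=N} = \eta\gamma N^{\gamma} \) is an assumption, since the summary above does not give a formula for data-reliance.

```python
import numpy as np


def fit_learning_curve(n, err, gammas=None):
    """Fit err ~ alpha + eta * n**gamma by grid search over gamma.

    For a fixed gamma the model is linear in (alpha, eta), so those two
    are solved by ordinary least squares; the gamma with the smallest
    sum of squared residuals is kept.
    """
    n = np.asarray(n, dtype=float)
    err = np.asarray(err, dtype=float)
    if gammas is None:
        # Assumed search range: decaying power laws with gamma in [-1, -0.05].
        gammas = np.linspace(-1.0, -0.05, 96)
    best = None
    for gamma in gammas:
        design = np.column_stack([np.ones_like(n), n ** gamma])
        coef, *_ = np.linalg.lstsq(design, err, rcond=None)
        sse = float(np.sum((design @ coef - err) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(coef[0]), float(coef[1]), float(gamma))
    _, alpha, eta, gamma = best
    return alpha, eta, gamma


def summarize_at(alpha, eta, gamma, N):
    """Return (e_N, beta_N) at training-set size N.

    beta_N = N * de/dn at n = N = eta * gamma * N**gamma
    (an assumed definition of data-reliance, see the lead-in).
    """
    e_N = alpha + eta * N ** gamma
    beta_N = eta * gamma * N ** gamma
    return e_N, beta_N


# Synthetic, noise-free errors for illustration only (not results from the paper).
sizes = np.array([25, 50, 100, 200, 400, 800, 1600], dtype=float)
errors = 0.10 + 2.0 * sizes ** -0.5

alpha, eta, gamma = fit_learning_curve(sizes, errors)
e_N, beta_N = summarize_at(alpha, eta, gamma, N=400)
print(f"alpha={alpha:.3f} eta={eta:.3f} gamma={gamma:.2f}")
print(f"e_400={e_N:.3f} beta_400={beta_N:.3f}")
```

On this synthetic data the fit recovers the generating parameters, giving \( e_{400} \approx 0.2 \) and \( \beta_{400} \approx -0.05 \). Under the assumed definition, a more negative \( \beta_N \) means the error is still falling quickly as data is added, so two design choices with similar \( e_N \) can still be told apart by how much they rely on more data.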