Optimization of deep learning models: benchmark and analysis

Rasheed Ahmad, Izzat Alsmadi, Mohammad Al-Ramahi
DOI: https://doi.org/10.1007/s43674-023-00055-1
2023-03-30
Advances in Computational Intelligence
Abstract: Model optimization in deep learning (DL) and neural networks is concerned with how and why a model can be successfully trained towards one or more objective functions. The evolutionary learning or training process continuously considers the dynamic parameters of the model. Many researchers propose a deep learning-based solution by randomly selecting a single classifier model architecture. Such approaches generally overlook the hidden and complex nature of the model’s internal workings, producing biased results. Larger and deeper neural network (NN) models bring many complexities and logistical challenges when they are built and deployed. To obtain high-quality performance results, an optimal model generally depends on appropriate architectural settings, such as the number of hidden layers and the number of neurons at each layer. Selecting and testing various combinations of these settings manually is challenging and time-consuming. This paper presents an extensive empirical analysis of various deep learning algorithms trained recursively using permuted settings to establish benchmarks and find an optimal model. The paper analyzed the Stack Overflow dataset to predict the quality of posted questions. The analysis revealed that some well-known deep learning algorithms, such as CNN, are the least effective at solving this problem compared to the multilayer perceptron (MLP), which provides efficient computing and the best results in terms of prediction accuracy. The analysis also shows that manipulating only the number of neurons at each layer in a network does not influence model optimization. These findings suggest that future models should be built by considering a wide range of architectural settings in order to reach an optimal solution.
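
The idea of testing permuted architectural settings (number of hidden layers, neurons per layer) can be illustrated with a minimal sketch. This is not the authors' code: the function names `architecture_grid` and `search`, the candidate depths and widths, and the `evaluate` callback are all assumptions made for illustration. In practice `evaluate` would train a model with the given layout and return a validation score.

```python
from itertools import product


def architecture_grid(depths, widths):
    """Yield every hidden-layer layout: one width per layer, for each depth.

    Hypothetical helper; the candidate values are illustrative only.
    """
    for depth in depths:
        for layout in product(widths, repeat=depth):
            yield layout


def search(evaluate, depths=(1, 2, 3), widths=(32, 64, 128)):
    """Score every layout with `evaluate`; return (best_layout, best_score).

    `evaluate` is an assumed callback that maps a layout tuple to a
    number where higher is better (for example, validation accuracy).
    """
    results = {
        layout: evaluate(layout)
        for layout in architecture_grid(depths, widths)
    }
    best = max(results, key=results.get)
    return best, results[best]
```

With three candidate widths and depths of one to three layers, the grid already contains 3 + 9 + 27 = 39 layouts, which shows why manual selection quickly becomes impractical.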