Abstract: Machine/deep learning models have been widely adopted for predicting the configuration performance of software systems. However, a crucial yet unaddressed challenge is how to cater for the sparsity inherited from the configuration landscape: both the influence of configuration options (features) and the distribution of data samples are highly sparse. In this paper, we propose a model-agnostic and sparsity-robust framework for predicting configuration performance, dubbed DaL, based on the new paradigm of dividable learning that builds a model via "divide-and-learn". To handle sample sparsity, the samples from the configuration landscape are divided into distant divisions; for each division, we build a sparse local model, e.g., a regularized Hierarchical Interaction Neural Network, to deal with the feature sparsity. A newly given configuration is then assigned to the model of the appropriate division for the final prediction. Further, DaL adaptively determines the optimal number of divisions required for a system and sample size without any extra training or profiling. Experimental results from 12 real-world systems and five sets of training data reveal that, compared with the state-of-the-art approaches, DaL performs no worse than the best counterpart in 44 out of 60 cases, with up to 1.61x improvement in accuracy; requires fewer samples to reach the same or better accuracy; and incurs acceptable training overhead. In particular, the mechanism that adapts the parameter d reaches the optimal value in 76.43% of the individual runs. The results also confirm that the paradigm of dividable learning is more suitable than other similar paradigms, such as ensemble learning, for predicting configuration performance. Practically, DaL considerably improves different global models when they are used as the underlying local models, which further strengthens its flexibility.
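
The "divide-and-learn" flow described in the abstract (divide the samples, train one local model per division, route a new configuration to its division's model) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration rather than DaL's actual design: the division step is a single split on one binary option chosen by squared error, the local model is a plain linear regressor trained by gradient descent (standing in for a regularized sparse model), the number of divisions is fixed at two instead of being adapted, and all function names and the toy data are hypothetical.

```python
from itertools import product
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split_option(X, y):
    """Pick the binary option whose on/off split minimizes within-division SSE."""
    def cost(j):
        on = [yi for xi, yi in zip(X, y) if xi[j] == 1]
        off = [yi for xi, yi in zip(X, y) if xi[j] == 0]
        if not on or not off:  # an option that never varies cannot divide the data
            return float("inf")
        return sse(on) + sse(off)
    return min(range(len(X[0])), key=cost)


def fit_linear(X, y, lr=0.1, epochs=2000):
    """Least-squares linear model trained by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errs = [b + sum(wj * xj for wj, xj in zip(w, xi)) - yi
                for xi, yi in zip(X, y)]
        b -= lr * 2 * sum(errs) / n
        w = [wj - lr * 2 * sum(e * xi[j] for e, xi in zip(errs, X)) / n
             for j, wj in enumerate(w)]
    return w, b


def predict_linear(model, x):
    w, b = model
    return b + sum(wj * xj for wj, xj in zip(w, x))


def fit_divide_and_learn(X, y):
    """Divide the samples on one option, then fit one local model per division."""
    j = best_split_option(X, y)
    models = {}
    for value in (0, 1):
        idx = [i for i, xi in enumerate(X) if xi[j] == value]
        models[value] = fit_linear([X[i] for i in idx], [y[i] for i in idx])
    return j, models


def predict_divide_and_learn(fitted, x):
    """Route a new configuration to its division's local model."""
    j, models = fitted
    return predict_linear(models[x[j]], x)


# Toy landscape (hypothetical): option 0 switches between two different regimes.
X = list(product([0, 1], repeat=3))
y = [10 + 5 * x[1] if x[0] else 1 + 2 * x[2] for x in X]

fitted = fit_divide_and_learn(X, y)
global_model = fit_linear(X, y)

local_err = max(abs(predict_divide_and_learn(fitted, x) - t) for x, t in zip(X, y))
global_err = max(abs(predict_linear(global_model, x) - t) for x, t in zip(X, y))
print("split option:", fitted[0])
print("max abs error, divided:", local_err)
print("max abs error, single global model:", global_err)
```

On this toy landscape, the two regimes are each linear, so the per-division models can fit them closely, whereas a single linear model over all samples cannot capture the interaction with the switching option. This only illustrates the intuition behind dividing before learning; it says nothing about how DaL performs on real systems.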
