Abstract: Managing the configurations of a database system poses significant challenges due to the multitude of configuration knobs that impact various system aspects. The lack of standardization, independence, and universality among these knobs further complicates the task of determining the optimal settings. To address this issue, an automated solution leveraging supervised and unsupervised machine learning techniques was developed. This solution aims to identify influential knobs, analyze previously unseen workloads, and provide recommendations for knob settings. The effectiveness of this approach is demonstrated through the evaluation of a new tool called OtterTune [1] on three different database management systems (DBMSs). The results indicate that OtterTune's recommendations are comparable to or even surpass the configurations generated by existing tools or human experts. In this study, we build upon the automated technique introduced in the original OtterTune paper, utilizing previously collected training data to optimize new DBMS deployments. By employing supervised and unsupervised machine learning methods, we focus on improving latency prediction. Our approach expands upon the methods proposed in the original paper by incorporating GMM clustering to streamline metrics selection and combining ensemble models (such as RandomForest) with non-linear models (like neural networks) for more accurate prediction modeling.
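
As a rough illustration of the two extensions named above, the following Python sketch clusters runtime metrics with a Gaussian mixture model and keeps one representative per cluster, then averages a random forest and a small neural network to predict latency. It is a minimal sketch on synthetic data, not the implementation evaluated in this study. The helper names (`select_representative_metrics`, `build_latency_model`), the use of scikit-learn, the choice of knob settings as model inputs, and the simple averaging via `VotingRegressor` are all assumptions made for illustration; the abstract does not specify how the models are combined.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.mixture import GaussianMixture
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def select_representative_metrics(metric_matrix, n_clusters, seed=0):
    """Hypothetical helper: cluster metrics (columns) with a GMM and
    return the index of one representative metric per non-empty cluster.

    metric_matrix has shape (n_observations, n_metrics).
    """
    # Each metric becomes one sample described by its standardized
    # values across all observations.
    profiles = StandardScaler().fit_transform(metric_matrix).T
    gmm = GaussianMixture(
        n_components=n_clusters,
        covariance_type="diag",
        reg_covar=1e-3,  # keeps variances away from zero for tiny clusters
        random_state=seed,
    )
    labels = gmm.fit_predict(profiles)
    selected = []
    for k in range(n_clusters):
        members = np.flatnonzero(labels == k)
        if members.size == 0:
            continue  # a component may end up with no assigned metrics
        # Keep the metric closest to the component mean.
        dists = np.linalg.norm(profiles[members] - gmm.means_[k], axis=1)
        selected.append(int(members[np.argmin(dists)]))
    return sorted(selected)


def build_latency_model(seed=0):
    """Hypothetical helper: average a random forest and a neural network."""
    forest = RandomForestRegressor(n_estimators=50, random_state=seed)
    # The network is scale-sensitive, so it gets its own scaler.
    net = make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=seed),
    )
    return VotingRegressor([("forest", forest), ("net", net)])


# Synthetic stand-in data (assumption): 9 metrics driven by 3 latent factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(60, 3))
metrics = np.repeat(factors, 3, axis=1) + 0.05 * rng.normal(size=(60, 9))
selected = select_representative_metrics(metrics, n_clusters=3)

# Synthetic knob settings and a non-linear latency response (assumption).
knobs = rng.uniform(0.0, 1.0, size=(200, 4))
latency = (
    5.0
    + 10.0 * (knobs[:, 0] - 0.5) ** 2
    + 2.0 * knobs[:, 1] * knobs[:, 2]
    + 0.1 * rng.normal(size=200)
)
model = build_latency_model().fit(knobs[:150], latency[:150])
predicted = model.predict(knobs[150:])
print("representative metric indices:", selected)
print("mean absolute error:", float(np.mean(np.abs(predicted - latency[150:]))))
```

Averaging is only one way to combine the two model families; stacking or weighting by validation error would fit the same description, and the number of mixture components would normally be chosen with a criterion such as BIC rather than fixed in advance.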
