Auptimizer -- an Extensible, Open-Source Framework for Hyperparameter Tuning

Jiayi Liu, Samarth Tripathi, Unmesh Kurup, Mohak Shah
DOI: https://doi.org/10.48550/arXiv.1911.02522
2019-11-07
Abstract: Tuning machine learning models at scale, especially finding the right hyperparameter values, can be difficult and time-consuming. In addition to the computational effort required, this process also requires some ancillary efforts including engineering tasks (e.g., job scheduling) as well as more mundane tasks (e.g., keeping track of the various parameters and associated results). We present Auptimizer, a general Hyperparameter Optimization (HPO) framework to help data scientists speed up model tuning and bookkeeping. With Auptimizer, users can use all available computing resources in distributed settings for model training. The user-friendly system design simplifies creating, controlling, and tracking of a typical machine learning project. The design also allows researchers to integrate new HPO algorithms. To demonstrate its flexibility, we show how Auptimizer integrates a few major HPO techniques (from random search to neural architecture search). The code is available at https://github.com/LGE-ARC-AdvancedAI/auptimizer.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the challenges of tuning machine learning models at scale, in particular finding suitable hyperparameter values. It proposes a framework named Auptimizer to help data scientists speed up model tuning and bookkeeping. The main problems the paper targets are:

1. **Underused computing resources**: Large-scale hyperparameter optimization (HPO) is computationally demanding, yet existing HPO tools are often unable to use all available resources in a distributed computing environment.
2. **Heavy engineering overhead**: HPO involves many ancillary tasks, such as scheduling jobs and tracking parameters and their associated results, which add to the engineering workload.
3. **Difficulty switching algorithms**: Different HPO algorithms expose different interfaces and configuration requirements, so users must modify their code each time they try a new algorithm, which raises the cost of adopting one.
4. **Limited scalability and flexibility**: Existing HPO tools are hard to extend with new HPO algorithms or new computing resources, and are especially hard to scale in multi-node environments.
5. **Result tracking and reproducibility**: Reproducible experimental results are essential in HPO, but existing tools give this limited support.

To address these problems, the paper proposes the Auptimizer framework, which has the following features:

- **Simpler use and switching of HPO algorithms**: Users can switch between HPO algorithms without significant code changes.
- **Scalability across cloud and local resources**: The framework supports distributed computing environments and can use both cloud and local computing resources.
- **Simpler integration of new HPO algorithms and resource schedulers**: A consistent API design makes it straightforward to integrate new algorithms and schedulers.
- **Reproducible results**: All configurations and results of an experiment are recorded automatically for later analysis and verification.

Auptimizer's design goal is to make the HPO process more automated and efficient while lowering the barrier to entry, so that it better meets the complex requirements of practical applications.
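
The idea of a consistent API that lets users swap HPO algorithms without touching the training code can be illustrated with a minimal sketch. This is not Auptimizer's actual API: the names `Proposer`, `propose`, `update`, `run_experiment`, and `toy_objective` are hypothetical and chosen only for illustration. The example assumes a toy objective in place of a real training job and runs everything sequentially in one process, so it does not cover distributed scheduling.

```python
import itertools
import random
from abc import ABC, abstractmethod


class Proposer(ABC):
    """Hypothetical common interface that every HPO algorithm implements."""

    @abstractmethod
    def propose(self):
        """Return the next hyperparameter dict, or None when finished."""

    def update(self, params, score):
        """Receive the score of a finished job (ignored by stateless searches)."""


class RandomSearch(Proposer):
    def __init__(self, space, n_samples, seed=0):
        self.space = space              # {name: (low, high)}
        self.remaining = n_samples
        self.rng = random.Random(seed)  # fixed seed keeps runs repeatable

    def propose(self):
        if self.remaining == 0:
            return None
        self.remaining -= 1
        return {k: self.rng.uniform(lo, hi) for k, (lo, hi) in self.space.items()}


class GridSearch(Proposer):
    def __init__(self, space, points_per_axis=3):
        # Evenly spaced points on each axis; requires points_per_axis >= 2.
        axes = []
        for lo, hi in space.values():
            step = (hi - lo) / (points_per_axis - 1)
            axes.append([lo + i * step for i in range(points_per_axis)])
        self.names = list(space)
        self.grid = itertools.product(*axes)

    def propose(self):
        values = next(self.grid, None)
        return None if values is None else dict(zip(self.names, values))


def run_experiment(proposer, objective):
    """Drive any Proposer and record every (params, score) pair."""
    history = []
    while (params := proposer.propose()) is not None:
        score = objective(**params)
        proposer.update(params, score)
        history.append({"params": params, "score": score})
    return history


def toy_objective(x, y):
    # Stand-in for a training job; its minimum is at (1, -2).
    return (x - 1) ** 2 + (y + 2) ** 2


SPACE = {"x": (-5.0, 5.0), "y": (-5.0, 5.0)}

# Switching algorithms only changes which Proposer is constructed.
for proposer in (RandomSearch(SPACE, n_samples=20), GridSearch(SPACE, points_per_axis=5)):
    history = run_experiment(proposer, toy_objective)
    best = min(history, key=lambda record: record["score"])
    print(type(proposer).__name__, len(history), best)
```

In this sketch the driver loop and the objective stay identical across algorithms, and the `history` list stands in for the automatic record of configurations and results described above.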