Compiler Autotuning through Multiple Phase Learning

Mingxuan Zhu, Dan Hao, Junjie Chen
DOI: https://doi.org/10.1145/3640330
IF: 3.685
2024-01-11
ACM Transactions on Software Engineering and Methodology
Abstract: Widely used compilers like GCC and LLVM usually have hundreds of optimizations controlled by optimization flags, which are enabled or disabled during compilation to improve the runtime performance (e.g., short execution time) of the compiled program. Due to the large number of optimization flags and their combinations, it is difficult for compiler users to manually tune compiler optimization flags. In the literature, a number of autotuning techniques have been proposed, which tune optimization flags for a compiled program by comparing its actual runtime performance under different optimization flag combinations. Due to the huge search space and heavy actual runtime cost, these techniques suffer from a widely recognized efficiency problem. To reduce the heavy runtime cost, in this paper we propose a lightweight learning approach which uses a small amount of actual runtime performance data to predict the runtime performance of a compiled program with various optimization flag combinations. Furthermore, to reduce the search space, we design a novel particle swarm algorithm which tunes compiler optimization flags with the prediction model. To evaluate the performance of the proposed approach, CompTuner, we conduct an extensive experimental study on two popular C compilers, GCC and LLVM, with two widely used benchmarks, cBench and PolyBench. The experimental results show that CompTuner significantly outperforms the six compared techniques, including the state-of-the-art technique BOCA.
computer science, software engineering
What problem does this paper attempt to address?
The paper attempts to address two problems: the difficulty of manually tuning compiler optimization flags and the inefficiency of existing automatic tuning techniques. Specifically:

1. **Challenges of Manual Tuning**: Modern compilers like GCC and LLVM have hundreds of optimization options that can be enabled or disabled to improve the runtime performance of the compiled program (e.g., reducing execution time). Because there are so many options and they can be combined in so many ways, it is difficult for users to manually select the appropriate optimization flags.
2. **Efficiency Issues of Existing Automatic Tuning Techniques**: Existing automatic tuning techniques can select optimization flags without manual effort, but they usually need a large amount of actual runtime data to compare candidate flag combinations. Together with the huge search space, this makes them very time-consuming. For example, unsupervised learning methods explore the combination space of optimization flags through search strategies (such as hill climbing or genetic algorithms) and select the best combination based on actual runtime performance, which requires compiling and running the program for each candidate.

To alleviate these issues, the paper proposes CompTuner, a compiler autotuning technique based on multi-stage (multiple phase) learning. CompTuner improves the efficiency and accuracy of automatic tuning through the following two main steps:

1. **Prediction Model Construction**: CompTuner constructs a lightweight prediction model through multi-stage learning. The model predicts the runtime performance of different optimization flag combinations from a small amount of actual runtime data. Specifically, CompTuner first performs initial learning with a small number of randomly generated optimization flag combinations, and then gradually improves the accuracy of the prediction model over multiple further learning stages.
2. **Intelligent Search**: CompTuner uses an improved Particle Swarm Optimization (PSO) algorithm to search for the best optimization flag combination under the guidance of the prediction model. The algorithm balances local search and global search by adjusting its parameters, thereby avoiding local optima and improving search efficiency.

Through these two steps, CompTuner can significantly reduce the actual runtime cost while improving the accuracy of compiler optimization flag selection, thereby enhancing the runtime performance of the compiled program.
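The two steps described above can be illustrated with a minimal Python sketch. It is not CompTuner's implementation: `measure_runtime` is a synthetic stand-in for compiling and timing a program, the prediction model is a simple k-nearest-neighbour average chosen only to keep the example dependency-free, and the search is a textbook binary PSO rather than the paper's improved variant. All names, flag counts, and parameter values are assumptions made for illustration.

```python
import math
import random

N_FLAGS = 12  # hypothetical number of on/off optimization flags


def measure_runtime(flags):
    """Synthetic stand-in for compiling with `flags` and timing the program."""
    weights = [((i * 7) % 5) - 2 for i in range(N_FLAGS)]  # fixed toy flag effects
    base = 10.0 - 0.3 * sum(w * f for w, f in zip(weights, flags))
    return base + 0.5 * (flags[0] ^ flags[1])  # one toy interaction between flags


def random_flags(rng):
    return tuple(rng.randint(0, 1) for _ in range(N_FLAGS))


def predict(train, flags, k=3):
    """Surrogate model (assumption): mean runtime of the k nearest measured
    configurations by Hamming distance."""
    nearest = sorted(train, key=lambda item: sum(a != b for a, b in zip(item[0], flags)))[:k]
    return sum(runtime for _, runtime in nearest) / len(nearest)


def build_model(rng, n_initial=8, n_phases=4, pool=50):
    """Phase 0: measure a few random configurations. Later phases: measure the
    unseen candidate the current model rates best, then add it to the data."""
    train, seen = [], set()

    def add(flags):
        if flags not in seen:
            seen.add(flags)
            train.append((flags, measure_runtime(flags)))

    while len(train) < n_initial:
        add(random_flags(rng))
    for _ in range(n_phases):
        candidates = [c for c in (random_flags(rng) for _ in range(pool)) if c not in seen]
        if candidates:
            add(min(candidates, key=lambda c: predict(train, c)))
    return train


def pso_search(train, rng, n_particles=10, n_iters=20, w=0.7, c1=1.5, c2=1.5):
    """Binary PSO scored by the surrogate only (no real measurements).
    w weights momentum; c1 and c2 weight the personal and global bests."""
    pos = [list(random_flags(rng)) for _ in range(n_particles)]
    vel = [[0.0] * N_FLAGS for _ in range(n_particles)]
    pbest = [tuple(p) for p in pos]
    pbest_score = [predict(train, p) for p in pbest]
    g = min(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g], pbest_score[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(N_FLAGS):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # Sigmoid of the velocity gives the probability of enabling flag d.
                pos[i][d] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-vel[i][d])) else 0
            score = predict(train, pos[i])
            if score < pbest_score[i]:
                pbest[i], pbest_score[i] = tuple(pos[i]), score
                if score < gbest_score:
                    gbest, gbest_score = pbest[i], score
    return gbest


def tune(seed=0):
    rng = random.Random(seed)
    train = build_model(rng)
    candidate = pso_search(train, rng)
    measured = dict(train)
    measured.setdefault(candidate, measure_runtime(candidate))  # one final real run
    best = min(measured, key=measured.get)
    return best, measured[best], len(measured)


if __name__ == "__main__":
    best, runtime, n_measured = tune()
    print(f"best flags: {best}, runtime: {runtime:.2f}, real measurements: {n_measured}")
```

In this sketch only `build_model` and the final check in `tune` call `measure_runtime`, so the number of real runs stays at 13 or fewer regardless of how many configurations the swarm evaluates against the model. That is the cost saving the summary attributes to a prediction-guided search.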