Stale Profile Matching

Amir Ayupov, Maksim Panchenko, Sergey Pupyrev
2024-01-31
Abstract: Profile-guided optimizations rely on profile data for directing compilers to generate optimized code. To achieve the maximum performance boost, profile data needs to be collected on the same version of the binary that is being optimized. In practice, however, there is typically a gap between the profile collection and the release, which makes a portion of the profile invalid for optimizations. This phenomenon is known as profile staleness, and it is a serious practical problem for data-center workloads, for both compilers and binary optimizers.
Programming Languages, Software Engineering
What problem does this paper attempt to address?
This paper studies a practical problem in profile-guided optimization (PGO): the **profile staleness problem**. Specifically:

1. **Background and challenges**:
   - PGO depends on profile data collected at run time to guide the compiler toward optimized code.
   - For the maximum performance improvement, the profile data must be collected on the same version of the binary that is being optimized.
   - In practice, there is usually a time lag between profile collection and release, which leaves part of the profile data invalid or no longer suitable for optimization. This phenomenon is called profile staleness.
2. **Severity of the problem**:
   - Profile staleness has a particularly significant impact on data-center workloads, and it affects both compilers and binary optimizers.
   - Experimental data shows that up to 70% of the profile samples may become stale between two consecutive versions; when the profile update is delayed by three weeks, the proportion exceeds 92%. This significantly reduces the effectiveness of optimization tools such as BOLT.
3. **Solutions**:
   - The paper proposes a new method for handling stale profile data so that it can still be used for optimization.
   - The authors developed a two-stage algorithm, implemented in the mainstream open-source post-link optimizer BOLT, that handles large-scale production binaries without significantly increasing runtime overhead.
   - In the experimental evaluation, the new method recovers 0.6 to 0.8 of the maximum BOLT speedup when most of the input profile data is stale.
4. **Contributions**:
   - A formal model of profile staleness and a corresponding two-stage algorithm.
   - An implementation of the algorithm in BOLT, backed by extensive experimental evaluation.
   - Experimental results showing that the new method still recovers a significant share of the optimization benefit even when most of the profile data is stale.

In summary, the paper addresses the loss of optimization effectiveness caused by stale profile data in profile-guided optimization, and it provides a method that lets stale profile data still be used effectively.
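To make the "fraction of stale samples" figure concrete, here is a minimal sketch of how such a number could be measured. It assumes a simplification that the summary does not state: a sample counts as stale when the function it belongs to is missing from the new binary or has a different code fingerprint. The `FunctionProfile` record, the `code_hash` field, and `stale_fraction` are hypothetical names for illustration, not BOLT's profile format or the paper's algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionProfile:
    """Profile record for one function (hypothetical layout, not BOLT's format)."""
    name: str
    code_hash: int  # fingerprint of the function body at profile-collection time
    samples: int    # number of profile samples attributed to the function


def stale_fraction(profile, binary_hashes):
    """Return the share of samples whose function changed or disappeared.

    Assumption: staleness is decided per function by exact hash mismatch.
    """
    total = sum(f.samples for f in profile)
    if total == 0:
        return 0.0
    stale = sum(
        f.samples
        for f in profile
        # A missing function yields None from .get(), which never equals the hash.
        if binary_hashes.get(f.name) != f.code_hash
    )
    return stale / total


profile = [
    FunctionProfile("parse", 0xA1, 700),
    FunctionProfile("render", 0xB2, 200),
    FunctionProfile("log", 0xC3, 100),
]
# In the new release, "parse" was modified and "log" was removed.
new_binary = {"parse": 0xA9, "render": 0xB2}
print(stale_fraction(profile, new_binary))  # 0.8
```

Under this all-or-nothing rule, a small edit to one hot function invalidates every sample attributed to it, which illustrates why the stale share can grow so quickly between releases and why recovering a usable profile from such samples is valuable.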