Cross-Modality Program Representation Learning for Electronic Design Automation with High-Level Synthesis

Zongyue Qin, Yunsheng Bai, Atefeh Sohrabizadeh, Zijian Ding, Ziniu Hu, Yizhou Sun, Jason Cong
2024-07-18
Abstract: In recent years, domain-specific accelerators (DSAs) have gained popularity for applications such as deep learning and autonomous driving. To facilitate DSA designs, programmers use high-level synthesis (HLS) to compile a high-level description written in C/C++ into a design in a low-level hardware description language, which is eventually synthesized into a DSA on circuits. However, creating a high-quality HLS design still demands significant domain knowledge, particularly in microarchitecture decisions expressed as pragmas. Thus, it is desirable to automate such decisions with the help of machine learning for predicting the quality of HLS designs, which requires a deeper understanding of the program, consisting of the original code and pragmas. Naturally, these programs can be considered as sequence data. In addition, these programs can be compiled and converted into a control data flow graph (CDFG). But existing works either fail to leverage both modalities or combine the two in shallow or coarse ways. We propose ProgSG, a model that allows interaction between the source code sequence modality and the graph modality in a deep and fine-grained way. To alleviate the scarcity of labeled designs, a pre-training method is proposed based on a suite of compiler data flow analysis tasks. Experimental results show that ProgSG reduces the RMSE of design performance predictions by up to 22%, and identifies designs with an average of 1.10× and 1.26× (up to 8.17× and 13.31×) performance improvement in the design space exploration (DSE) task compared to HARP and AutoDSE, respectively.
Machine Learning, Artificial Intelligence, Hardware Architecture
What problem does this paper attempt to address?
The paper addresses cross-modal program representation learning for electronic design automation (EDA), specifically for the design of domain-specific accelerators (DSAs) used in applications such as deep learning and autonomous driving. In these designs, programmers use high-level synthesis (HLS) to transform high-level descriptions written in C/C++ into low-level hardware description languages, from which the DSA circuits are synthesized. Creating a high-quality HLS design still requires substantial domain expertise, especially for microarchitecture decisions, which are expressed through pragmas.

The paper proposes a model called ProgSG that deeply integrates two views of the same program: the source code sequence modality and the control data flow graph (CDFG) modality. Existing works either fail to exploit both modalities or combine them only in shallow or coarse ways. ProgSG addresses this with an attention summarization architecture and a fine-grained node-to-token message passing mechanism that lets the two modalities interact. To alleviate the scarcity of labeled designs, the paper also proposes a pre-training method based on compiler data flow analysis tasks.

Experimental results show that ProgSG reduces the root mean square error (RMSE) of design performance prediction by up to 22%. In the design space exploration (DSE) task, it identifies designs with an average performance improvement of 1.10 times over HARP and 1.26 times over AutoDSE (up to 8.17 times and 13.31 times, respectively). Overall, the paper aims to make machine-learning-driven automation and acceleration of IC design optimization more accessible, particularly for software programmers. By combining the strengths of source code and CDFG representations, ProgSG improves representation learning for HLS designs, enabling more accurate prediction of design quality and more effective exploration of the design space.
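To make the two quantitative ideas in this summary concrete, the following is a minimal sketch, not the authors' implementation. It assumes that node-to-token message passing can be approximated by scaled dot-product attention from each CDFG node over the source-token vectors, followed by a residual update; the function names `node_to_token_update` and `rmse` and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def node_to_token_update(node_vecs, token_vecs):
    """Assumed form of fine-grained interaction: every graph node attends
    over all source tokens and adds the weighted token summary to itself."""
    dim = len(token_vecs[0])
    scale = math.sqrt(dim)
    updated = []
    for node in node_vecs:
        # Similarity between this node and each token (scaled dot product).
        scores = [sum(n * t for n, t in zip(node, tok)) / scale
                  for tok in token_vecs]
        weights = softmax(scores)
        # Attention-weighted sum of token vectors, one value per dimension.
        message = [sum(w * tok[d] for w, tok in zip(weights, token_vecs))
                   for d in range(dim)]
        # Residual update keeps the node's own graph information.
        updated.append([n + m for n, m in zip(node, message)])
    return updated


def rmse(predicted, actual):
    """Root mean square error, the prediction metric named in the summary."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Toy usage: one node, two tokens, two-dimensional vectors.
nodes = node_to_token_update([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
error = rmse([1.0, 2.0], [1.0, 4.0])
```

In this sketch a node with a zero vector attends equally to both tokens, so its update is simply their average. A real model would use learned projections and operate on batched tensors; the plain-list version here is only meant to show the data flow between the two modalities.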