CircuitNet: an Open-Source Dataset for Machine Learning in VLSI CAD Applications with Improved Domain-Specific Evaluation Metric and Learning Strategies.

Zhuomin Chai, Yuxiang Zhao, Wei Liu, Yibo Lin, Runsheng Wang, Ru Huang
DOI: https://doi.org/10.1109/tcad.2023.3287970
IF: 2.9
2023-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: The design automation community has been actively exploring machine learning (ML) for very-large-scale-integrated (VLSI) computer-aided design (CAD). Many studies have explored learning-based techniques for cross-stage prediction tasks in the design flow. Although building ML models usually requires a large amount of data, most studies can only generate small internal datasets for validation because large public datasets are lacking. This situation hampers research in the field and raises issues such as difficulty in benchmarking and reproducing results, a research scope limited to small internal datasets, and a high bar for new researchers. Therefore, in this article, we present an open-source dataset called “CircuitNet” for ML tasks in VLSI CAD. The dataset consists of more than 10K samples extracted from versatile runs of commercial design tools on six open-source RISC-V designs. It supports typical cross-stage prediction tasks, such as routability and IR drop prediction, and comes with extensive benchmarking on recent models. With the dataset prepared, we identify two practical challenges for ML applications in CAD: data imbalance and model transferability. To overcome data imbalance, we propose a loss function, biased loss, that gives more weight to the minority, leading to a 2% congestion reduction in routability-driven placement. We test model transferability from RISC-V designs to ISPD 2015 contest designs in congestion prediction with several transfer learning methods, and further propose a knowledge distillation-based transfer learning framework with up to 20% accuracy improvement. We believe this dataset can open up new opportunities for ML in CAD research and beyond.
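The abstract describes the biased loss only as giving more weight to the minority; it does not state the formula. As a rough illustration of that general idea, and not the paper's actual loss, the sketch below shows a weighted mean squared error in which samples whose target exceeds a threshold (for example, highly congested regions) count more. The function name `weighted_mse`, the `threshold`, and the `minority_weight` parameter are all hypothetical choices made for this example.

```python
def weighted_mse(pred, target, threshold=0.5, minority_weight=4.0):
    """Weighted mean squared error (illustrative, not the paper's biased loss).

    Samples whose target is above `threshold` are treated as the minority
    class and contribute `minority_weight` times as much as the others.
    The result is normalized by the sum of the weights.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if not pred:
        raise ValueError("inputs must not be empty")

    total = 0.0
    weight_sum = 0.0
    for p, t in zip(pred, target):
        # Hypothetical weighting rule: up-weight high-valued (rare) targets.
        w = minority_weight if t > threshold else 1.0
        total += w * (p - t) ** 2
        weight_sum += w
    return total / weight_sum
```

With `minority_weight=1.0` this reduces to an ordinary mean squared error, so the weight acts as a single knob for how strongly errors on the rare, high-valued samples are penalized relative to the rest.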