Multi-objective Hardware-aware Neural Architecture Search with Pareto Rank-preserving Surrogate Models

Hadjer Benmeziane, Hamza Ouarnoughi, Kaoutar El Maghraoui, Smail Niar
DOI: https://doi.org/10.1145/3579853
IF: 1.444
2023-04-15
ACM Transactions on Architecture and Code Optimization
Abstract: Deep learning (DL) models such as convolutional neural networks (ConvNets) are being deployed to solve various computer vision and natural language processing tasks at the edge. It is a challenge to find the right DL architecture that simultaneously meets the accuracy, power, and performance budgets of such resource-constrained devices. Hardware-aware Neural Architecture Search (HW-NAS) has recently gained steam by automating the design of efficient DL models for a variety of target hardware platforms. However, such algorithms require excessive computational resources. Thousands of GPU days are required to evaluate and explore an architecture search space such as FBNet [45]. State-of-the-art approaches propose using surrogate models to predict architecture accuracy and hardware performance to speed up HW-NAS. Existing approaches use independent surrogate models to estimate each objective, resulting in non-optimal Pareto fronts. In this article, HW-PR-NAS, a novel Pareto rank-preserving surrogate model for edge computing platforms, is presented. Our model integrates a new loss function that ranks the architectures according to their Pareto rank, regardless of the actual values of the various objectives. We employ a simple yet effective surrogate model architecture that can be generalized to any standard DL model. We then present an optimized evolutionary algorithm that uses and validates our surrogate model. Our approach has been evaluated on seven edge hardware platforms from various classes, including ASIC, FPGA, GPU, and multi-core CPU. The evaluation results show that HW-PR-NAS achieves up to 2.5× speedup compared to state-of-the-art methods while achieving 98% near the actual Pareto front.
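To make the notion of a Pareto rank concrete, the following minimal sketch assigns ranks by standard non-dominated sorting. It is an illustration only: the abstract does not give the HW-PR-NAS loss function or surrogate architecture, and neither is reproduced here. The function names (`dominates`, `pareto_ranks`) and the example objectives (classification error and latency, both minimized) are assumptions chosen for the example.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is no worse than `b` on every objective and
    strictly better on at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_ranks(points: List[Sequence[float]]) -> List[int]:
    """Assign rank 1 to the non-dominated points, rank 2 to the points
    that become non-dominated once rank 1 is removed, and so on."""
    ranks = [0] * len(points)
    remaining = set(range(len(points)))
    rank = 1
    while remaining:
        # Current front: points not dominated by any other remaining point.
        front = {
            i for i in remaining
            if not any(dominates(points[j], points[i])
                       for j in remaining if j != i)
        }
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks


# Hypothetical architectures described by (error in %, latency in ms).
architectures = [
    (5.0, 10.0),
    (6.0, 8.0),
    (6.0, 12.0),
    (7.0, 13.0),
    (4.0, 20.0),
]
ranks = pareto_ranks(architectures)
print(ranks)  # [1, 1, 2, 3, 1]
```

The ranks depend only on the dominance relations between architectures, not on the magnitudes of the objective values, which is the property the abstract refers to when it says architectures are ranked "regardless of the actual values of the various objectives".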
computer science, theory & methods, hardware & architecture