Hardware Latency-Aware Differential Architecture Search: Search for Latency-Friendly Architectures on Different Hardware

Ye Zhou, Dan Wang, Bin Song, Feng Gao
DOI: https://doi.org/10.2139/ssrn.4346398
2023-01-01
Abstract: Owing to its low search cost, Differentiable Architecture Search (DARTS) has recently attracted considerable interest. Most DARTS-based methods, however, focus on improving a single indicator (e.g., accuracy), which biases the search toward complex networks with stronger representational capabilities. The architectures found by these methods therefore tend to have high latency, which makes DARTS difficult to deploy on-device, in low-latency scenarios, or on edge devices with limited computing power. Searching for a latency-friendly architecture tailored to devices with different computing power is an intuitive way to make DARTS easier to deploy, but it is also a challenge. To address this challenge, we propose a Hardware Latency-Aware Differentiable Search (HL-DARTS) algorithm. The algorithm designs a multi-layer regression network that uses a soft attention mechanism to predict latency on the corresponding hardware device, which allows a differentiable latency loss term to be added to the DARTS objective. We further propose an adaptive constraint amplitude, a mechanism for balancing accuracy and latency while searching for a latency-friendly architecture for a given hardware device. We conduct experiments on the CIFAR-10 dataset. The architectures searched by HL-DARTS have significantly lower latency and fewer parameters while achieving nearly the same accuracy as the baseline DARTS algorithm. We also conduct ablation experiments on different datasets (CIFAR-100, MIO-TCD, ImageNet) and different hardware devices (GPU, Intel-CPU, AMD-CPU). The experimental results show that HL-DARTS can find an ideal architecture for each hardware device and that these architectures are also broadly applicable to various datasets.
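To make the idea of a differentiable latency loss term concrete, the following is a minimal sketch and not the authors' implementation. It assumes a fixed per-operation latency table (`op_latency_ms`) in place of the learned regression network with soft attention described in the abstract, and a fixed weight (`lam`) in place of the adaptive constraint amplitude, whose details the abstract does not give. All function and parameter names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of architecture logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_latency(alpha, op_latency_ms):
    """Softmax-weighted latency summed over all edges.

    alpha: one list of logits per edge (one logit per candidate operation).
    op_latency_ms: assumed measured latency of each candidate operation;
    a stand-in for the paper's learned latency predictor.
    """
    return sum(
        sum(w * lat for w, lat in zip(softmax(edge_logits), op_latency_ms))
        for edge_logits in alpha
    )


def latency_grad(alpha, op_latency_ms):
    """Analytic gradient of expected_latency with respect to every logit.

    For one edge with weights w = softmax(a):
    d/da_i sum_j w_j * l_j = w_i * (l_i - sum_j w_j * l_j).
    """
    grads = []
    for edge_logits in alpha:
        w = softmax(edge_logits)
        mean = sum(wi * lat for wi, lat in zip(w, op_latency_ms))
        grads.append([wi * (lat - mean) for wi, lat in zip(w, op_latency_ms)])
    return grads


def total_loss(task_loss, alpha, op_latency_ms, lam):
    """Task loss plus a latency penalty; lam is a fixed, hypothetical weight."""
    return task_loss + lam * expected_latency(alpha, op_latency_ms)


if __name__ == "__main__":
    alpha = [[0.0, 0.0, 0.0]]          # one edge, three candidate operations
    op_latency_ms = [1.0, 2.0, 3.0]    # illustrative values, not measurements
    print(expected_latency(alpha, op_latency_ms))
    print(latency_grad(alpha, op_latency_ms))
    print(total_loss(0.5, alpha, op_latency_ms, lam=0.1))
```

In this sketch the gradient is positive for operations slower than the current weighted average and negative for faster ones, so a gradient-descent step on the architecture logits shifts weight toward lower-latency operations while the task loss pulls toward accuracy.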