Diversity in Neural Architecture Search

Wenzheng Hu, Mingyang Li, Changhe Yuan, Changshui Zhang, Jianqiang Wang
DOI: https://doi.org/10.1109/ijcnn48605.2020.9206793
2020-01-01
Abstract: Neural architecture search (NAS) is usually divided into two phases: model search, where candidate architectures are trained briefly for a small number of epochs (e.g., 20) and a search strategy is used to find one or more top candidates, and model tuning, where the top candidates are trained fully (e.g., for 600 epochs) and one final best architecture is chosen. The top M-best strategy (M-Best) is typically used to help find better candidates during model search. However, the top M solutions may concentrate in narrow, similar regions of the search space and lack diversity. Furthermore, empirical evidence suggests that the performance distribution of models that have only gone through early training does not correlate strongly with that of fully trained models. Therefore, many of the M best solutions may turn out to be sub-optimal simultaneously because of their similarity, which limits the ability to find the true top architectures. To alleviate these problems, we define diverse M-best architectures that are both of high quality and sufficiently different from each other, based on a novel graph-based architecture distance. The concept is very general and is applicable to existing architecture search methods that use top M-Best. To the best of our knowledge, this is the first time that diversity has been introduced into architecture search. We applied the method to the progressive neural architecture search (PNAS) algorithm (Liu et al. 2018a). Experimental results show that our diverse M-Best is indeed beneficial for finding better architectures.
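The idea of selecting candidates that are both high-scoring and mutually distant can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's method: architectures are represented as sets of directed edges, `jaccard_distance` is a simple stand-in for the graph-based architecture distance (whose definition the abstract does not give), and `diverse_m_best` with its `min_dist` threshold is a hypothetical greedy selection rule.

```python
from typing import Callable, FrozenSet, List, Tuple

# Assumed representation: an architecture is a set of directed edges
# between named operations. The abstract does not specify this.
Edge = Tuple[str, str]
Arch = FrozenSet[Edge]


def jaccard_distance(a: Arch, b: Arch) -> float:
    """Stand-in distance: 1 - |shared edges| / |all edges|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def diverse_m_best(
    candidates: List[Tuple[Arch, float]],
    m: int,
    min_dist: float,
    distance: Callable[[Arch, Arch], float] = jaccard_distance,
) -> List[Tuple[Arch, float]]:
    """Greedily keep up to m candidates, best score first, skipping any
    candidate closer than min_dist to one already kept (hypothetical rule)."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    selected: List[Tuple[Arch, float]] = []
    for arch, score in ranked:
        if len(selected) == m:
            break
        if all(distance(arch, kept) >= min_dist for kept, _ in selected):
            selected.append((arch, score))
    return selected


# Toy candidates with early-training scores (made-up values).
arch_a: Arch = frozenset({("in", "conv3"), ("conv3", "out")})
arch_b: Arch = frozenset({("in", "conv3"), ("conv3", "out"), ("in", "out")})
arch_c: Arch = frozenset({("in", "pool"), ("pool", "out")})
pool = [(arch_a, 0.90), (arch_b, 0.89), (arch_c, 0.85)]

plain_top2 = diverse_m_best(pool, m=2, min_dist=0.0)    # ordinary top-2
diverse_top2 = diverse_m_best(pool, m=2, min_dist=0.5)  # diversity enforced
```

With `min_dist=0.0` the rule reduces to plain top M-best and keeps the two near-identical architectures; with a positive threshold the second slot goes to the lower-scoring but structurally different candidate, which is the trade-off the abstract motivates.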