Abstract: Neural Architecture Search (NAS), which aims to design network architectures automatically, is expected to bring about a new revolution in machine learning. Despite these high expectations, the effectiveness and efficiency of existing NAS solutions are unclear, with some recent works going so far as to suggest that many existing NAS solutions are no better than random architecture selection. The ineffectiveness of NAS solutions may be attributed to inaccurate architecture evaluation. Specifically, to speed up NAS, recent works have proposed under-training different candidate architectures in a large search space concurrently by using shared network parameters; however, this has resulted in incorrect architecture ratings and has made NAS even less effective. In this work, we propose to modularize the large search space of NAS into blocks to ensure that the potential candidate architectures are fully trained; this reduces the representation shift caused by the shared parameters and leads to the correct rating of the candidates. Thanks to the block-wise search, we can also evaluate all of the candidate architectures within each block. Moreover, we find that the knowledge of a network model lies not only in the network parameters but also in the network architecture. Therefore, we propose to distill the neural architecture (DNA) knowledge from a teacher model to supervise our block-wise architecture search, which significantly improves the effectiveness of NAS. Remarkably, our searched architectures outperform the teacher model, demonstrating the practicability of our method. Finally, our method achieves a state-of-the-art 78.4% top-1 accuracy on ImageNet in a mobile setting. All of our searched models, along with the evaluation code, are available at https://github.com/changlin31/DNA.
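
The block-wise, teacher-supervised rating described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' implementation. It assumes that each candidate for a block receives the teacher's feature map from the previous block as input and is rated by the mean squared error against the teacher's output for that block; the abstract does not spell out these details. All names (`mse`, `rate_block_candidates`, `blockwise_search`, the toy operations) are hypothetical, and the training of candidates is omitted entirely.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Op = Callable[[Vector], Vector]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equally sized feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rate_block_candidates(
    candidates: Dict[str, Op], block_input: Vector, teacher_output: Vector
) -> Dict[str, float]:
    """Rate every candidate of one block by its distance to the teacher output."""
    return {name: mse(op(block_input), teacher_output) for name, op in candidates.items()}


def blockwise_search(
    teacher_blocks: Sequence[Op], search_space: Sequence[Dict[str, Op]], x: Vector
) -> List[str]:
    """Pick, block by block, the candidate that best mimics the teacher block."""
    chosen: List[str] = []
    feat = x
    for teacher_block, candidates in zip(teacher_blocks, search_space):
        target = teacher_block(feat)
        losses = rate_block_candidates(candidates, feat, target)
        chosen.append(min(losses, key=losses.get))
        # Assumption: the next block is fed the teacher's feature map, so
        # blocks are rated independently of each other.
        feat = target
    return chosen


# Toy teacher with two blocks: "double" followed by "add one".
teacher = [
    lambda v: [2.0 * t for t in v],
    lambda v: [t + 1.0 for t in v],
]

# Toy search space: two hypothetical candidates per block.
space = [
    {"scale_2": lambda v: [2.0 * t for t in v], "scale_3": lambda v: [3.0 * t for t in v]},
    {"plus_1": lambda v: [t + 1.0 for t in v], "plus_2": lambda v: [t + 2.0 for t in v]},
]

chosen = blockwise_search(teacher, space, [1.0, 2.0])
print(chosen)
```

Because each block is rated against the teacher independently, every candidate within a block can be enumerated and scored, which is the property the abstract attributes to the block-wise search.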

Understanding and Exploring the Network with Stochastic Architectures

SNAS: Stochastic Neural Architecture Search

Understanding Architectures Learnt by Cell-based Neural Architecture Search

Block-wisely Supervised Neural Architecture Search with Knowledge Distillation

Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift

A Technical View on Neural Architecture Search

A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions

Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach

Surrogate-assisted Evolutionary Neural Architecture Search with Network Embedding

A Feedback-inspired Super-network Shrinking Framework for Flexible Neural Architecture Search

Understanding and Accelerating Neural Architecture Search with Training-Free and Theory-Grounded Metrics

Understanding Neural Architecture Search Techniques

Disentangled Neural Architecture Search

MNGNAS: Distilling Adaptive Combination of Multiple Searched Networks for One-Shot Neural Architecture Search

Generalization Properties of NAS under Activation and Skip Connection Search

One-Shot Neural Architecture Search: Maximising Diversity to Overcome Catastrophic Forgetting

Progressive DARTS: Bridging the Optimization Gap for NAS in the Wild

Learning Reliable Neural Networks with Distributed Architecture Representations

Differentiable Neural Architecture Search in Equivalent Space with Exploration Enhancement

Architecture Disentanglement for Deep Neural Networks

Fine-Grained Stochastic Architecture Search