Abstract: Neural architecture search (NAS) has shown great promise in automatically designing neural network models. Recently, block-wise NAS has been proposed to alleviate the deep coupling between architectures and weights that exists in conventional weight-sharing NAS, by training the huge weight-sharing supernet block by block. However, existing block-wise NAS methods, which rely on either supervised distillation or self-supervised contrastive learning to enable block-wise optimization, incur a massive computational cost. Specifically, the former introduces an external high-capacity teacher model, while the latter involves a supernet-scale momentum model and requires a long training schedule. To address this, we propose a resource-friendly deeply supervised block-wise NAS (DBNAS) method. In DBNAS, we attach a lightweight deeply-supervised module after each block, which enables a simple supervised learning scheme in which ground-truth labels indirectly supervise the optimization of each block progressively. In addition, the deeply-supervised module is specifically designed as a structural and functional condensation of the supernet, which establishes global awareness for progressive block-wise optimization and helps the search find promising architectures. Experimental results show that DBNAS takes less than 1 GPU day to find promising architectures on the ImageNet dataset, with a smaller GPU memory footprint than other block-wise NAS methods. The best-performing model in the searched DBNAS family achieves 75.6% Top-1 accuracy on ImageNet, which is competitive with state-of-the-art NAS models. Moreover, the DBNAS family models also achieve good transfer performance on CIFAR-10/100, as well as on two downstream tasks: object detection and semantic segmentation.
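The progressive, label-supervised block-wise scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes scalar "candidate operations", a one-parameter auxiliary head standing in for the deeply-supervised module, plain SGD on a squared error, and a fixed path (candidate 0) through already-trained blocks. All names (`Block`, `AuxHead`, `train_blockwise`, `demo`) are hypothetical, and the condensation design of the module is not modeled.

```python
import random


class Block:
    """Toy supernet block: each candidate operation is a scalar gain (assumption)."""

    def __init__(self, num_candidates, rng):
        self.gains = [rng.uniform(0.5, 1.5) for _ in range(num_candidates)]

    def forward(self, x, choice):
        return self.gains[choice] * x


class AuxHead:
    """Toy stand-in for a lightweight deeply-supervised module after a block."""

    def __init__(self):
        self.w = 1.0

    def forward(self, h):
        return self.w * h


def prefix_output(blocks, upto, x):
    """Run x through the already-trained (frozen) blocks, using candidate 0 in each."""
    h = x
    for block in blocks[:upto]:
        h = block.forward(h, 0)
    return h


def block_loss(blocks, heads, i, data):
    """Mean squared-error loss of block i's head, averaged over candidates and samples."""
    total, count = 0.0, 0
    for x, y in data:
        h = prefix_output(blocks, i, x)
        for c in range(len(blocks[i].gains)):
            pred = heads[i].forward(blocks[i].forward(h, c))
            total += 0.5 * (pred - y) ** 2
            count += 1
    return total / count


def train_blockwise(blocks, heads, data, steps=3000, lr=0.02, seed=0):
    """Train blocks one at a time; labels reach block i only through its own head."""
    rng = random.Random(seed)
    report = []
    for i in range(len(blocks)):
        before = block_loss(blocks, heads, i, data)
        for _ in range(steps):
            x, y = rng.choice(data)
            h = prefix_output(blocks, i, x)           # earlier blocks are not updated
            c = rng.randrange(len(blocks[i].gains))   # sample one candidate path
            out = blocks[i].forward(h, c)
            err = heads[i].forward(out) - y           # supervised by the ground-truth label
            grad_head = err * out
            grad_gain = err * heads[i].w * h
            heads[i].w -= lr * grad_head
            blocks[i].gains[c] -= lr * grad_gain
        report.append((before, block_loss(blocks, heads, i, data)))
    return report


def demo():
    rng = random.Random(0)
    blocks = [Block(3, rng) for _ in range(3)]
    heads = [AuxHead() for _ in blocks]
    data = [(x, 2.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]  # toy labels y = 2x
    return train_blockwise(blocks, heads, data)
```

Calling `demo()` returns one `(loss_before, loss_after)` pair per block. In this sketch only the current block and its head receive gradient updates at each stage, so no teacher model or momentum copy of the supernet is needed.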

Deeply Supervised Block-Wise Neural Architecture Search
