Abstract: Zeroth-order (ZO) optimization has become a popular technique for solving machine learning (ML) problems when first-order (FO) information is difficult or impossible to obtain. However, the scalability of ZO optimization remains an open problem: its use has primarily been limited to relatively small-scale ML problems, such as sample-wise adversarial attack generation. To the best of our knowledge, no prior work has demonstrated the effectiveness of ZO optimization in training deep neural networks (DNNs) without a significant decrease in performance. To overcome this roadblock, we develop DeepZero, a principled ZO deep learning (DL) framework that can scale ZO optimization to DNN training from scratch through three primary innovations. First, we demonstrate the advantages of coordinate-wise gradient estimation (CGE) over randomized vector-wise gradient estimation in training accuracy and computational efficiency. Second, we propose a sparsity-induced ZO training protocol that extends the model pruning methodology using only finite differences to explore and exploit the sparse DL prior in CGE. Third, we develop the methods of feature reuse and forward parallelization to advance the practical implementations of ZO training. Our extensive experiments show that DeepZero achieves state-of-the-art (SOTA) accuracy on ResNet-20 trained on CIFAR-10, approaching FO training performance for the first time. Furthermore, we show the practical utility of DeepZero in applications of certified adversarial defense and DL-based partial differential equation error correction, achieving 10-20% improvement over SOTA. We believe our results will inspire future research on scalable ZO optimization and contribute to advancing black-box DL. Code is available at https://github.com/OPTML-Group/DeepZero.
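
To make the distinction between the two estimators named in the abstract concrete, the following is a minimal NumPy sketch, not the DeepZero implementation. The function names `cge_gradient` and `rge_gradient`, the smoothing parameter `mu`, the query count `num_queries`, and the optional boolean `mask` (standing in for a sparsity pattern) are all illustrative assumptions; the forward-difference form and the Gaussian perturbations are one common choice among several.

```python
import numpy as np


def cge_gradient(f, x, mu=1e-3, mask=None):
    """Coordinate-wise gradient estimate via forward finite differences.

    Assumption: `mask` is a hypothetical boolean array marking which
    coordinates to estimate; unmarked coordinates are left at zero,
    which saves one function query per skipped coordinate.
    """
    x = np.asarray(x, dtype=float)
    fx = f(x)  # one baseline query shared by all coordinates
    grad = np.zeros_like(x)
    for i in range(x.size):
        if mask is not None and not np.asarray(mask).flat[i]:
            continue
        e = np.zeros_like(x)
        e.flat[i] = 1.0  # i-th standard basis vector
        grad.flat[i] = (f(x + mu * e) - fx) / mu
    return grad


def rge_gradient(f, x, mu=1e-3, num_queries=10, rng=None):
    """Randomized vector-wise gradient estimate.

    Averages directional finite differences along random Gaussian
    directions; the result is a noisy estimate of the gradient.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    fx = f(x)
    grad = np.zeros_like(x)
    for _ in range(num_queries):
        u = rng.standard_normal(x.shape)  # random perturbation direction
        grad += (f(x + mu * u) - fx) / mu * u
    return grad / num_queries


if __name__ == "__main__":
    # Toy objective with known gradient 2 * x.
    f = lambda x: float(np.sum(x ** 2))
    x0 = np.array([1.0, -2.0, 3.0])
    print("CGE:", cge_gradient(f, x0))
    print("RGE:", rge_gradient(f, x0, num_queries=50,
                               rng=np.random.default_rng(0)))
```

In this sketch the coordinate-wise estimator costs one function query per estimated coordinate plus a shared baseline query, so masking out coordinates reduces the query count directly, while the randomized estimator's cost is set by `num_queries` and its output carries sampling noise.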

DouRN: Improving DouZero by Residual Neural Networks

DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning

Full DouZero+: Improving DouDizhu AI by Opponent Modeling, Coach-Guided Training and Bidding Learning

JP-DouZero: an enhanced DouDiZhu AI based on reinforcement learning with peasant collaboration and intrinsic rewards

AlphaDou: High-Performance End-to-End Doudizhu AI Integrating Bidding

RARSMSDou: Master the Game of DouDiZhu With Deep Reinforcement Learning Algorithms

A Deep Reinforcement Learning-Based Approach in Porker Game

DanZero: Mastering GuanDan Game with Reinforcement Learning

DanZero+: Dominating the GuanDan Game through Reinforcement Learning

PerfectDou: Dominating DouDizhu with Perfect Information Distillation

Combinatorial Q-Learning for Dou Di Zhu

Neural Auto-Curricula in Two-Player Zero-Sum Games

DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training

Combining Tree Search and Action Prediction for State-of-the-Art Performance in DouDiZhu

ScrofaZero: Mastering Trick-taking Poker Game Gongzhu by Deep Reinforcement Learning

Efficient Learning for AlphaZero via Path Consistency

Building a Computer Mahjong Player via Deep Convolutional Neural Networks

Neural Auto-Curricula

Enhanced LSTM‐DQN algorithm for a two‐player zero‐sum game in three‐dimensional space

Playing a FPS Doom Video Game with Deep Visual Reinforcement Learning

M$^2$DQN: A Robust Method for Accelerating Deep Q-learning Network