OSSP-PTA: an Online Stochastic Stepping Policy for PTA on Reinforcement Learning

Dan Niu, Yichao Dong, Zhou Jin, Chuan Zhang, Qi Li, Changyin Sun
DOI: https://doi.org/10.1109/tcad.2023.3251731
IF: 2.9
2023-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: The dc analysis is essential and still quite challenging in large-scale nonlinear circuit simulation. Pseudo transient analysis (PTA) is a widely used solver in industry with great potential. However, PTA convergence and simulation efficiency are still seriously affected by the stepping policy. This article proposes an online stochastic stepping policy (OSSP) for PTA based on deep reinforcement learning (DRL). To achieve better policy evaluation and stronger stepping exploration ability, dual soft Actor–Critic agents work with the proposed valuation splitting and online momental scaling. This enables OSSP to intelligently encode the PTA iteration status and to further adjust the forward and backward time-step sizes online for unseen test circuits, without human intervention or domain knowledge; the policy is trained solely by reinforcement learning from self-search. A public sample buffer and priority sampling are also introduced to overcome the sparsity and imbalance of sample data. Numerical examples demonstrate that the proposed OSSP achieves a significant efficiency speedup (up to $47.0\times$ fewer Newton–Raphson iterations) and convergence enhancement on unseen test circuits compared with the previous iter-based and switched evolution/relaxation-based stepping methods, in just one stepping iteration.