Significance of activation functions in developing an online classifier for semiconductor defect detection

Kain Lu Low, Jieming Pan, Joydeep Ghosh, Min Wu, Xiaoli Li, Aaron Voon-Yew Thean, J. Senthilnath, Md Meftahul Ferdaus, Zhou Bangjian, Yoon Ji Wei
DOI: https://doi.org/10.1016/j.knosys.2022.108818
2022-04-01
Abstract: In anomaly detection problems for advanced semiconductor devices, non-visual defects occur frequently. Machine learning (ML) algorithms are well suited to identifying such defects. However, in this real-world problem, data arrive sequentially as a stream, so there may not be sufficient data to train an ML model in batch mode. In such a scenario, online ML models are useful for detecting defects immediately, since they work in a single-pass mode. Moreover, when data are collected from more realistic non-stationary monitoring environments, online ML models with an evolving architecture are more practical. Thus, evolving and online ML models are developed in this work to detect defects in a technology computer-aided design (TCAD)-based digital twin model of advanced nano-scaled semiconductor devices such as a fin field-effect transistor (FinFET) and a gate-all-around field-effect transistor (GAA-FET). Activation functions (AFs) in deep neural networks (DNNs) and membership functions (MFs) in neuro-fuzzy systems (NFSs) play an important role in the performance of those ML models. This work focuses on analysing the effects of various AFs/MFs in the developed online ML models while detecting defects in real-world nano-scaled semiconductor devices, where a large number of training samples is not available. Across various semiconductor datasets with few samples, the proposed evolving neuro-fuzzy system (ENFS) with a Leaky-ReLU MF performs better than the other DNN- or ENFS-based online ML models, improving overall classification accuracy by 1.9% to 30.8%. Because the proposed model has an evolving architecture and an online learning mechanism, its performance has also been evaluated, beyond anomaly detection, on large data stream problems with concept drift. The proposed method has been compared with some recently developed baselines under the prequential test-then-train protocol. Its classification rates improve on those of the existing methods by 1.1% to 65.9%. The code of this work has been made publicly available at https://github.com/MdFerdaus/LREC.
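The prequential test-then-train protocol mentioned in the abstract can be illustrated with a minimal sketch: each incoming sample is first used to test the current model and only afterwards to update it, so every prediction is made on data the model has not yet seen. The classifier below is a plain online logistic model chosen as a stand-in for illustration; it is not the ENFS proposed in the paper, and the names `OnlineLogistic`, `prequential_accuracy`, and `make_stream` are hypothetical.

```python
import math
import random


class OnlineLogistic:
    """Stand-in single-pass classifier (assumption: NOT the paper's ENFS)."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def _score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def predict(self, x):
        return 1 if self._score(x) > 0 else 0

    def learn_one(self, x, y):
        # Clamp the score so math.exp cannot overflow.
        z = max(-30.0, min(30.0, self._score(x)))
        p = 1.0 / (1.0 + math.exp(-z))
        err = y - p
        # One stochastic gradient step on the log-loss for this sample.
        self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b += self.lr * err


def prequential_accuracy(model, stream):
    """Test-then-train: predict on each sample first, then learn from it."""
    correct = 0
    total = 0
    for x, y in stream:
        correct += int(model.predict(x) == y)  # test step
        model.learn_one(x, y)                  # train step
        total += 1
    return correct / total if total else 0.0


def make_stream(n, seed=0):
    """Hypothetical synthetic stream: label is 1 when x0 + x1 > 0."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, int(x[0] + x[1] > 0)


if __name__ == "__main__":
    acc = prequential_accuracy(OnlineLogistic(2), make_stream(500))
    print(f"prequential accuracy: {acc:.3f}")
```

Because the model is scored before it is updated, the resulting accuracy reflects performance on unseen samples without requiring a separate held-out set, which is why the protocol suits single-pass streaming settings.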
computer science, artificial intelligence