A Diagnostic Procedure for High-Dimensional Data Streams via Missed Discovery Rate Control

Wendong Li, Dongdong Xiang, Fugee Tsung, Xiaolong Pu
DOI: https://doi.org/10.1080/00401706.2019.1575284
2019-05-24
Technometrics
Abstract: Monitoring complex systems involving high-dimensional data streams (HDS) provides quick real-time detection of abnormal changes of system performance, but accurate and efficient diagnosis of the streams responsible has also become increasingly important in many data-rich statistical process control applications. Existing diagnostic procedures, designed for low/moderate dimensional multivariate processes, may miss too much important information in the out-of-control streams with a high signal-to-noise ratio (SNR) or waste too many resources finding useless in-control streams with a low SNR. In addition, these procedures do not differentiate between streams according to their severity. In this article, we formulate the diagnosis problem of HDS as a multiple testing problem and provide a computationally fast diagnostic procedure to control the weighted missed discovery rate (wMDR) at some satisfactory level. The proposed procedure overcomes the limitations of conventional diagnostic procedures by controlling the wMDR and minimizing the expected number of false positives as well. We show theoretically that the proposed procedure is asymptotically valid and optimal in a certain sense. Simulation studies and a real data analysis from a semiconductor manufacturing process show that the proposed procedure works very well in practice.
statistics & probability