Abstract: The Internet environment supplies massive amounts of data to industrial production processes. These data are not only large in volume but also high in dimension, which challenges traditional statistical process monitoring. To address the nonlinearity and dynamics of large-scale, high-dimensional industrial data, an efficient iterative multiple dynamic kernel principal component analysis (IMDKPCA) method is proposed for monitoring complex industrial processes with super-large-scale, high-dimensional data. In KPCA, a new matrix KK<sup>T</sup> is first constructed from the kernel matrix K. Because K is symmetric, the newly constructed matrix has the same eigenvectors as K; hence, each column of K can be used as an input sample of the iterative algorithm. After the iteration, the kernel principal components can be obtained quickly without eigendecomposition. Because the kernel matrix is not stored beforehand, the computational cost of the kernel is effectively reduced. For very large data sets in particular, traditional eigendecomposition is no longer practical, whereas the proposed method can still be solved quickly. The autoregressive moving average (ARMA) time series model and kernel principal component analysis (KPCA) are combined to build the IDKPCA model, which handles the dynamics and nonlinearity of the industrial process. Finally, the method is applied to fault monitoring in the penicillin fermentation process and compared with MKPCA to verify its accuracy and applicability.
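The claim that the columns of K can act as input samples follows from a short piece of linear algebra. The derivation below is a sketch under stated assumptions: K is a symmetric positive semidefinite n × n kernel matrix, and the symbols V, Λ, λ<sub>j</sub> and k<sub>i</sub> (the i-th column of K) are notation introduced here, not taken from the paper.

\[
K = V \Lambda V^{T}
\quad\Longrightarrow\quad
K K^{T} = K^{2} = V \Lambda^{2} V^{T}
\]

\[
K K^{T} = \sum_{i=1}^{n} k_i k_i^{T}
\]

The first line shows that KK<sup>T</sup> has the same eigenvectors as K, with eigenvalues λ<sub>j</sub><sup>2</sup>; because λ<sub>j</sub> ≥ 0, the ordering of the components is preserved. The second line shows that KK<sup>T</sup> is a sum of outer products of the columns of K, which is the form of an (unnormalized) scatter matrix of the n vectors k<sub>i</sub>. An iterative PCA scheme can therefore consume one column at a time instead of requiring the full matrix.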

What problem does this paper attempt to address?

The problem this paper attempts to solve is how to monitor the nonlinear and dynamic characteristics of a batch process with ultra-large-scale data sets, given the complexity of large-scale, high-dimensional industrial data. Specifically, traditional statistical process monitoring methods struggle with large volumes of high-dimensional data, especially when the data are nonlinear and dynamic. The paper therefore proposes an efficient iterative multiple dynamic kernel principal component analysis (IMDKPCA) method to meet these challenges.

### Specific description of the problem

1. **Large data and high dimension**: The data generated in modern industrial production processes are not only large in quantity but also high in dimension, which poses great challenges to traditional statistical process monitoring methods.
2. **Nonlinear and dynamic characteristics**: The data in industrial processes usually have nonlinear and dynamic characteristics, which traditional linear methods cannot handle effectively.
3. **Computational complexity**: Traditional methods such as kernel principal component analysis (KPCA) need to perform eigenvalue decomposition and matrix inversion operations. When the data scale is very large, these operations impose a huge computational burden and may even become infeasible.

### Proposed solutions

To overcome the above problems, this paper proposes the IMDKPCA method. Its main features include:

- **Avoid eigenvalue decomposition**: By constructing a new \(KK^T\) matrix and using the properties of symmetric matrices, the method takes the columns of the kernel matrix directly as input samples for an iterative operation, thereby avoiding eigenvalue decomposition and greatly reducing the computational complexity.
- **Combine ARMA model**: The autoregressive moving average (ARMA) time-series model is combined with KPCA to construct the IDKPCA model, which deals with the dynamic and nonlinear problems in industrial processes.
- **Applicable to ultra-large-scale data sets**: For ultra-large-scale data sets, the method can be solved quickly without pre-storing the kernel matrix, thereby effectively reducing the computational complexity.

### Application examples

The method is applied to fault monitoring of the penicillin fermentation process, and a comparative experiment with multi-way kernel principal component analysis (MKPCA) is carried out to verify its accuracy and applicability.

### Conclusion

Experiments on the penicillin fermentation process indicate that the IMDKPCA method shows higher efficiency and accuracy in dealing with the nonlinear and dynamic characteristics of large-scale, high-dimensional data sets, with clear advantages in real-time monitoring and fault detection.
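The idea of iterating over kernel columns without storing the kernel matrix can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the update rule, so plain power iteration on KK<sup>T</sup> is used here as a stand-in, and time-lag stacking is used as a simple stand-in for the ARMA-based dynamic model. The RBF kernel, the omission of kernel centering, and the names `lag_augment`, `rbf_column`, `leading_kernel_component` and `gamma` are all assumptions made for illustration.

```python
import numpy as np


def lag_augment(X, lags):
    """Stack each sample with its `lags` predecessors.

    Hypothetical helper: a common dynamic-PCA device, used here in place
    of the ARMA formulation, which the abstract does not spell out.
    Row t of the result is [x_{t+lags}, x_{t+lags-1}, ..., x_t].
    """
    n = X.shape[0]
    return np.hstack([X[lags - j : n - j] for j in range(lags + 1)])


def rbf_column(X, i, gamma):
    """Column i of the RBF kernel matrix, computed on demand from the data."""
    d2 = np.sum((X - X[i]) ** 2, axis=1)
    return np.exp(-gamma * d2)


def leading_kernel_component(X, gamma=0.5, n_iter=200, tol=1e-10, seed=0):
    """Leading eigenpair of the (uncentered) kernel matrix K.

    Power iteration on K K^T = sum_i k_i k_i^T, where each column k_i is
    recomputed when needed, so only O(n) memory is used for kernel values.
    """
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        w = np.zeros(n)
        for i in range(n):
            k_i = rbf_column(X, i, gamma)
            w += k_i * (k_i @ v)  # accumulate (K K^T) v one column at a time
        w /= np.linalg.norm(w)
        converged = np.linalg.norm(w - v) < tol
        v = w
        if converged:
            break
    # Rayleigh quotient v^T K v gives the eigenvalue of K itself
    lam = sum(v[i] * (rbf_column(X, i, gamma) @ v) for i in range(n))
    return v, lam


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((30, 2))   # toy data: 30 time steps, 2 variables
    Xd = lag_augment(X, lags=2)        # 28 samples, 6 lagged variables
    v, lam = leading_kernel_component(Xd, gamma=0.1)
    print(Xd.shape, v.shape, lam > 0)
```

The sketch trades time for memory: every pass recomputes all n columns, so it never holds the n × n kernel matrix but does n kernel-column evaluations per iteration. Further components could be obtained by deflation, and a practical monitoring model would also center the kernel in feature space; both steps are left out here.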

Efficient Iterative Dynamic Kernel Principal Component Analysis Monitoring Method for the Batch Process with Super-large-scale Data Sets

Multivariate Statistical Process Monitoring of an Industrial Polypropylene Catalyzer Reactor with Component Analysis and Kernel Density Estimation

Block Adaptive Kernel Principal Component Analysis for Nonlinear Process Monitoring

Moving window kernel PCA for adaptive monitoring of nonlinear processes

Nonlinear Process Monitoring Based on Improved Kernel ICA

Adaptive KPCA Modeling of Nonlinear Systems

Recursive kernel PCA and its application in adaptive monitoring of nonlinear processes

Dynamic PCA-based Fault Detection and Diagnosis Analysis

On-line Monitoring of Penicillin Production Process Based on Multiway Kernel Principal Component Analysis

Combination Of Independent Component Analysis And Multi-Way Principal Component Analysis For Batch Process Monitoring

Nonlinear Process Monitoring Using Improved Kernel Principal Component Analysis

On-Line Monitoring of Batch Processes Using Generalized Additive Kernel Principal Component Analysis

The Application of Dynamic Principal Component Analysis to Enhance Chunk Monitoring of an Industrial Fluidized-Bed Reactor

Author Response for "Distributed Process Monitoring for Large‐scale Processes Based on MJMI‐weighted DKPCA"

Nonlinear Batch Process Monitoring Using Phase-Based Kernel-Independent Component Analysis-Principal Component Analysis (KICA-PCA)

Learning a Data-Dependent Kernel Function for KPCA-based Nonlinear Process Monitoring

Kernel Generalization of Multi-Rate Probabilistic Principal Component Analysis for Fault Detection in Nonlinear Process

Nonlinear Process Monitoring Based on Maximum Variance Unfolding Projections.

Improved kernel PCA-based monitoring approach for nonlinear processes

Improvement of Kernel Principal Component Analysis-Based Approach for Nonlinear Process Monitoring by Data Set Size Reduction Using Class Interval

Distributed Parallel PCA for Modeling and Monitoring of Large-Scale Plant-Wide Processes with Big Data.