Bucketed PCA Neural Networks with Neurons Mirroring Signals

Jackie Shen
DOI: https://doi.org/10.48550/arXiv.2108.00605
2021-08-02
Abstract: The bucketed PCA neural network (PCA-NN) with transforms is developed here in an effort to benchmark deep neural networks (DNN's), for problems on supervised classification. Most classical PCA models apply PCA to the entire training data set to establish a reductive representation and then employ non-network tools such as high-order polynomial classifiers. In contrast, the bucketed PCA-NN applies PCA to individual buckets which are constructed in two consecutive phases, as well as retains a genuine architecture of a neural network. This facilitates a fair apple-to-apple comparison to DNN's, esp. to reveal that a major chunk of accuracy achieved by many impressive DNN's could possibly be explained by the bucketed PCA-NN (e.g., 96% out of 98% for the MNIST data set as an example). Compared with most DNN's, the three building blocks of the bucketed PCA-NN are easier to comprehend conceptually - PCA, transforms, and bucketing for error correction. Furthermore, unlike the somewhat quasi-random neurons ubiquitously observed in DNN's, the PCA neurons resemble or mirror the input signals and are more straightforward to decipher as a result.
Machine Learning, Artificial Intelligence, Optimization and Control
What problem does this paper attempt to address?
There are two main problems that this paper attempts to solve:

1. **Provide a benchmark model for deep neural networks (DNNs)**:
   - The author develops a neural network based on principal component analysis (PCA-NN) to benchmark against mainstream DNNs. Traditional parametric models (such as linear regression or logistic regression) can also serve as benchmarks, but their architectures differ so much from a DNN's that the comparison explains little. PCA-NN retains a genuine neural network architecture, so it can be compared with DNNs on fair terms.
   - Through this comparison, the author hopes to reveal how much of the high accuracy of DNNs can be explained by the classic PCA framework. For example, on the MNIST dataset, PCA-NN reaches about 96% accuracy, against the roughly 98% achieved by many DNNs.
2. **Construct a neural network that is easy to interpret**:
   - The author aims to build a neural network whose neurons are easier to interpret, in line with the current trend toward "explainable AI". In the financial industry in particular, everyone from chief risk officers to front-desk quantitative traders expects models to offer a certain degree of interpretability.
   - PCA neurons naturally reflect the common structure of the input signals and their main modes of variation, so they are more intuitive and easier to understand than the neurons in a DNN.

### Specific contributions of the paper

To achieve the above goals, the author proposes the following three main components:

1. **Construct neurons directly through PCA**:
   - Apply PCA to each "bucket" of training samples instead of the entire training set. Each "bucket" corresponds to a specific category or label.
2. **Neuron transformation**:
   - Extract more signal features by transforming existing neurons. For example, rotating neurons improves the recognition of handwritten digits written at different orientations.
3. **Error correction through bucketing**:
   - Analyze and reuse misclassified samples to further improve the model. For example, samples that are misclassified as another category are grouped into new buckets, and PCA on those buckets yields additional neurons that improve classification accuracy.

Through these methods, the author shows that PCA-NN reaches a final accuracy of more than 96% on the MNIST dataset, with a design process that is transparent and easy to interpret.
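
The per-bucket PCA and error-correction ideas above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the scoring rule (assign a sample to the bucket with the smallest PCA reconstruction residual), the function names, the `min_bucket_size` threshold, and the synthetic toy data are all assumptions made for illustration, and the neuron-transform step (e.g., rotation) is left out.

```python
import numpy as np

def fit_bucket(X, n_components):
    """PCA of one bucket: returns (mean, components), components as rows."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered bucket.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def residuals(X, buckets):
    """Reconstruction error of every sample against every bucket."""
    out = np.empty((X.shape[0], len(buckets)))
    for j, (_, mean, comps) in enumerate(buckets):
        Z = X - mean
        proj = Z @ comps.T @ comps  # projection onto the bucket's PCA subspace
        out[:, j] = np.linalg.norm(Z - proj, axis=1)
    return out

def predict(X, buckets):
    """Assumed scoring rule: label of the bucket with the smallest residual."""
    labels = np.array([label for label, _, _ in buckets])
    return labels[np.argmin(residuals(X, buckets), axis=1)]

def fit_bucketed_pca(X, y, n_components=3, min_bucket_size=10):
    # Phase 1: one bucket per label.
    buckets = [(label, *fit_bucket(X[y == label], n_components))
               for label in np.unique(y)]
    # Phase 2 (error correction): samples of class `true_label` that were
    # predicted as `wrong_label` form a new bucket that keeps the true label.
    y_hat = predict(X, buckets)
    for true_label in np.unique(y):
        for wrong_label in np.unique(y_hat):
            mask = (y == true_label) & (y_hat == wrong_label)
            if wrong_label != true_label and mask.sum() >= min_bucket_size:
                buckets.append((true_label, *fit_bucket(X[mask], n_components)))
    return buckets

def make_toy_data(seed=0, n_classes=3, n_per_class=200, dim=20, latent=3):
    """Hypothetical stand-in for real data: each class lies near a low-dim subspace."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for label in range(n_classes):
        center = rng.normal(scale=5.0, size=dim)
        basis = rng.normal(size=(latent, dim))
        coeffs = rng.normal(size=(n_per_class, latent))
        noise = rng.normal(scale=0.1, size=(n_per_class, dim))
        X.append(center + coeffs @ basis + noise)
        y.append(np.full(n_per_class, label))
    return np.vstack(X), np.concatenate(y)
```

Calling `fit_bucketed_pca(*make_toy_data())` returns a list of `(label, mean, components)` tuples, which `predict` then scores against new samples. The rows of each `components` array play the role of the PCA neurons: they live in the same space as the inputs, which is what makes them directly inspectable.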