Factor-Bounded Nonnegative Matrix Factorization

Kai Liu, Xiangyu Li, Zhihui Zhu, Lodewijk Brand, Hua Wang
DOI: https://doi.org/10.1145/3451395
IF: 4.157
2021-05-19
ACM Transactions on Knowledge Discovery from Data
Abstract: Nonnegative Matrix Factorization (NMF) is broadly used to determine class membership in a variety of clustering applications. From movie recommendations and image clustering to visual feature extraction, NMF has applications to solve a large number of knowledge discovery and data mining problems. Traditional optimization methods, such as the Multiplicative Updating Algorithm (MUA), solve the NMF problem by utilizing an auxiliary function to ensure that the objective monotonically decreases. Although the objective in MUA converges, there exists no proof to show that the learned matrix factors converge as well. Without this rigorous analysis, the clustering performance and stability of the NMF algorithms cannot be guaranteed. To address this knowledge gap, in this article, we study the factor-bounded NMF problem and provide a solution algorithm with proven convergence by rigorous mathematical analysis, which ensures that both the objective and matrix factors converge. In addition, we show the relationship between MUA and our solution followed by an analysis of the convergence of MUA. Experiments on both toy data and real-world datasets validate the correctness of our proposed method and its utility as an effective clustering algorithm.
computer science, information systems, software engineering
What problem does this paper attempt to address?
This paper addresses several key issues in the optimization of existing Nonnegative Matrix Factorization (NMF) algorithms:

1. **Convergence**: Traditional NMF optimization methods, such as the Multiplicative Updating Algorithm (MUA), guarantee that the objective function decreases monotonically, but they lack a rigorous mathematical proof that the learned matrix factors (the decomposed matrices \(F\) and \(G\)) also converge. As a result, the clustering performance and stability of the NMF algorithm cannot be fully guaranteed.
2. **Local optima**: Existing NMF algorithms are prone to getting stuck in local optima, and the problem is more severe for high-dimensional data. This limits the effectiveness of NMF in practical applications.
3. **Non-uniqueness of the solution**: Because of soft labeling, the NMF solution is usually not unique, which further affects the stability and interpretability of the algorithm.

To address these problems, the paper proposes Factor-Bounded Nonnegative Matrix Factorization (FB-NMF). FB-NMF imposes upper and lower bounds on the entries of the factor matrices, which ensures that the matrix factors converge and improves the robustness and interpretability of the algorithm. Specifically, FB-NMF solves

\[ \min_{F, G} h(F, G)=\frac{1}{2}\|X - FG\|_F^2 \]

subject to the constraints

\[ 0\leq f_l\leq F_{ij}\leq f_u,\quad 0\leq g_l\leq G_{ij}\leq g_u \]

These constraints keep the entries of the matrix factors from becoming too large or too small, which improves the stability of the algorithm, and they also make the factors easier to interpret in practical applications.
For example, in image analysis, the entries of the feature matrix \(F\) usually lie between 0 and 255, which makes them easier to interpret. In addition, the paper provides a rigorous mathematical proof of the global sequential convergence of both the objective function and the matrix factors. Experimental results show that FB-NMF performs better than existing NMF methods on toy datasets and real-world datasets. In particular, on data containing outliers, FB-NMF gives more reasonable clustering results.
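The box-constrained problem above can be explored with a short sketch. This is not the paper's solution algorithm: it is a generic alternating projected-gradient method, chosen here only because projecting onto the bounds \([f_l, f_u]\) and \([g_l, g_u]\) reduces to elementwise clipping. The function name `bounded_nmf`, its parameters, the step-size rule (one over the spectral norm of \(GG^T\) or \(F^TF\)), and the random data and bounds in the usage lines are all assumptions made for illustration.

```python
import numpy as np


def objective(X, F, G):
    """h(F, G) = 0.5 * ||X - F G||_F^2, as in the summary above."""
    return 0.5 * np.linalg.norm(X - F @ G, "fro") ** 2


def bounded_nmf(X, rank, f_bounds=(0.0, 1.0), g_bounds=(0.0, 1.0),
                n_iter=200, seed=0):
    """Alternating projected gradient for box-constrained NMF.

    Illustrative sketch only (hypothetical helper, not the paper's method).
    Each half-step takes a gradient step of size 1/L on one factor and then
    clips that factor back into its box.
    """
    rng = np.random.default_rng(seed)
    f_l, f_u = f_bounds
    g_l, g_u = g_bounds
    m, n = X.shape
    # Start from a feasible point inside the boxes.
    F = rng.uniform(f_l, f_u, size=(m, rank))
    G = rng.uniform(g_l, g_u, size=(rank, n))
    history = [objective(X, F, G)]
    for _ in range(n_iter):
        # F-step: grad_F h = (F G - X) G^T, Lipschitz constant ||G G^T||_2.
        L_F = max(np.linalg.norm(G @ G.T, 2), 1e-12)
        F = np.clip(F - ((F @ G - X) @ G.T) / L_F, f_l, f_u)
        # G-step: grad_G h = F^T (F G - X), Lipschitz constant ||F^T F||_2.
        L_G = max(np.linalg.norm(F.T @ F, 2), 1e-12)
        G = np.clip(G - (F.T @ (F @ G - X)) / L_G, g_l, g_u)
        history.append(objective(X, F, G))
    return F, G, history


# Usage on synthetic nonnegative data (assumed sizes and bounds).
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(20, 3)) @ rng.uniform(0.0, 1.0, size=(3, 15))
F, G, history = bounded_nmf(X, rank=3, f_bounds=(0.0, 2.0), g_bounds=(0.0, 2.0))
print(f"initial objective: {history[0]:.4f}, final objective: {history[-1]:.4f}")
```

With a step size of 1/L on each subproblem, the objective is non-increasing across iterations, and the clipping keeps every entry of `F` and `G` inside its box. This sketch makes no claim about convergence of the factors themselves, which is the property the paper analyzes.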