Hierarchical Multiple Kernel K-Means Algorithm Based on Sparse Connectivity

Lei Wang, Liang Du, Peng Zhou
DOI: https://doi.org/10.11896/jsjx.22040023
2024-10-27
Abstract: Multiple kernel learning (MKL) aims to find an optimal, consistent kernel function. In the hierarchical multiple kernel clustering (HMKC) algorithm, sample features are extracted layer by layer from a high-dimensional space to retain as much effective information as possible. However, information interaction between layers is often ignored: in this model, only corresponding nodes in adjacent layers exchange information, while all other nodes remain isolated. Adopting full connectivity instead reduces the diversity of the final consistency matrix. This paper therefore proposes a hierarchical multiple kernel K-Means algorithm based on sparse connectivity (SCHMKKM), which uses a sparsity rate to control the assignment matrix and achieve sparse connections, thereby locally fusing the features obtained by distilling information between layers. Finally, we conduct cluster analysis on multiple datasets and compare the method experimentally with the fully connected hierarchical multiple kernel K-Means (FCHMKKM) algorithm. The results show that fusing more discriminative information is beneficial for learning a better consistent partition matrix, and that the fusion strategy based on sparse connectivity outperforms the full connection strategy.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the following problem: in Hierarchical Multiple Kernel Clustering (HMKC), information interaction between layers is largely ignored, which limits the diversity of the final consensus matrix and in turn degrades clustering performance. Specifically:

1. **Information Interaction Problem**: In the HMKC algorithm, only corresponding nodes in adjacent layers exchange information, while all other nodes are isolated. This design ignores broader information interaction between layers and may lose effective information.
2. **Fully-Connected Problem**: Adopting full connectivity enhances information interaction, but it weakens the diversity of the final consensus matrix and thus hurts clustering quality.

To solve these problems, the paper proposes a Sparse Connectivity Hierarchical Multiple Kernel K-Means algorithm (SCHMKKM). By introducing a sparse connectivity strategy, the algorithm increases the diversity of information fusion between layers while maintaining information interaction, thereby improving clustering performance.

### Main Contributions:

1. **Proposing a Sparse Connectivity Strategy**: Local discriminative information is fused between layers by using a sparsity rate to control the assignment matrix.
2. **Experimental Verification**: Comparative experiments on multiple datasets demonstrate that the sparse connectivity strategy outperforms the fully-connected strategy in clustering performance.
3. **Theoretical Analysis**: The paper analyzes how the sparse connectivity strategy increases the diversity of the consensus partition matrix and further improves the clustering result.

Through these improvements, the SCHMKKM algorithm can improve the accuracy and robustness of clustering while retaining effective information.
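
The idea of controlling an assignment matrix through a sparsity rate can be illustrated with a small NumPy sketch. This is not the paper's optimization procedure, which is not given in the text above. It only assumes that the sparsity rate is the fraction of connections removed per node, that the strongest connections are the ones kept, and that each node in the next layer is a weighted sum of the kernels in the previous layer. The function names `sparsify_assignment` and `fuse_layer` are hypothetical.

```python
import numpy as np


def sparsify_assignment(A, sparsity_rate):
    """Keep only the largest weights in each row of A and renormalize.

    Assumption (not from the paper): `sparsity_rate` is the fraction of
    connections dropped per row, and the largest weights are kept.
    """
    A = np.asarray(A, dtype=float)
    n_cols = A.shape[1]
    # Number of connections kept per row; always keep at least one.
    keep = max(1, int(round((1.0 - sparsity_rate) * n_cols)))
    # Column indices of the `keep` largest entries in each row.
    top = np.argpartition(-A, keep - 1, axis=1)[:, :keep]
    mask = np.zeros(A.shape, dtype=bool)
    np.put_along_axis(mask, top, True, axis=1)
    S = np.where(mask, A, 0.0)
    # Renormalize so the kept weights in each row sum to one.
    row_sums = S.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    return S / row_sums


def fuse_layer(kernels, A):
    """Build next-layer kernels as weighted sums of the current kernels.

    Row i of A holds the weights that output node i places on each input
    kernel; zero entries correspond to missing (sparse) connections.
    """
    K = np.stack(kernels)                   # shape (m, n, n)
    return np.einsum("ij,jab->iab", A, K)   # shape (p, n, n)


# Toy example: 3 next-layer nodes connected to 4 current-layer kernels.
A = np.array([
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.15, 0.10, 0.70],
    [0.20, 0.40, 0.10, 0.30],
])
S = sparsify_assignment(A, sparsity_rate=0.5)   # two connections per node
kernels = [np.eye(5) * (j + 1) for j in range(4)]
fused = fuse_layer(kernels, S)
```

With `sparsity_rate=0.0` every connection is kept, which corresponds to the fully connected case in this toy setting; larger rates leave each node with fewer, stronger connections.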