Nice to meet images with Big Clusters and Features: A cluster-weighted multi-modal co-clustering method
Chaoyang Zhang, Hang Xue, Kai Nie, Xihui Wu, Zhengzheng Lou, Shouyi Yang, Qinglei Zhou, Shizhe Hu
DOI: https://doi.org/10.1016/j.ipm.2024.103735
IF: 7.466
2024-05-20
Information Processing & Management
Abstract: Multi-modal image clustering explores and exploits related information across the modalities of input images to obtain clear image cluster patterns. Recent multi-modal/multi-view clustering methods have shown promising performance on the image clustering problem. However, most existing methods fail to properly handle multi-modal image data with a massive number of clusters and high-dimensional features, as arise in real-world applications such as image retrieval, multi-modal autonomous driving perception, and industrial automation. We call this problem "Big Clusters and Features" for short, by analogy with "Big Data" for a large number of samples. To address this challenging problem, we design a general multi-modal image clustering framework that integrates cluster-wise weight learning, feature learning, and clustering structure learning. Under this framework, we further propose a new Cluster-weighted Multi-modal Information Bottleneck Co-clustering (CMIBC) method, which can effectively measure image cluster importance and the discriminative features of each modality to obtain satisfactory image clustering performance. Unlike existing cluster weight learning methods, which consider only intra-cluster similarity or only cross-cluster dissimilarity, we design a novel cluster-wise weight learning strategy that considers both jointly. Carefully designed experiments on various multi-modal image datasets with big clusters and features reveal the competitive advantages of the CMIBC algorithm over many single-modal and multi-modal clustering baselines, notably improvements of 3.12% and 5.28% on the Leaves Plant Species dataset in terms of accuracy and normalized mutual information, respectively. Owing to this promising performance, the proposed CMIBC may be extended to many other practical applications, e.g., multi-modal medical analysis and video recognition.
computer science, information systems; information science & library science
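
The abstract reports results in terms of clustering accuracy and normalized mutual information (NMI). As a rough illustration of what these two metrics measure, here is a minimal pure-Python sketch. It is not the authors' evaluation code: the function names are hypothetical, the accuracy uses a brute-force search over one-to-one cluster-to-class mappings (practical only for a handful of clusters, whereas a real evaluation with many clusters would use an assignment solver), and the NMI is normalized by the arithmetic mean of the two entropies, which is one common convention and may differ from the one used in the paper.

```python
from collections import Counter
from itertools import permutations
from math import log


def clustering_accuracy(true_labels, pred_labels):
    """Fraction of samples correctly labeled under the best one-to-one
    mapping from predicted cluster ids to true class ids (brute force)."""
    classes = sorted(set(true_labels))
    clusters = sorted(set(pred_labels))
    # How often each (cluster, class) pair co-occurs.
    overlap = Counter(zip(pred_labels, true_labels))
    # Pad with None so that surplus clusters can stay unmatched.
    padded = classes + [None] * max(0, len(clusters) - len(classes))
    best = max(
        sum(overlap[(c, k)] for c, k in zip(clusters, perm))
        for perm in permutations(padded, len(clusters))
    )
    return best / len(true_labels)


def normalized_mutual_information(true_labels, pred_labels):
    """NMI = I(T; P) / mean(H(T), H(P)); assumed normalization variant."""
    n = len(true_labels)
    count_true = Counter(true_labels)
    count_pred = Counter(pred_labels)
    count_joint = Counter(zip(true_labels, pred_labels))

    def entropy(counts):
        return -sum((c / n) * log(c / n) for c in counts.values())

    mutual_info = sum(
        (c / n) * log((c * n) / (count_true[t] * count_pred[p]))
        for (t, p), c in count_joint.items()
    )
    denom = (entropy(count_true) + entropy(count_pred)) / 2
    # Both partitions trivial (a single group each): treat as perfect agreement.
    return mutual_info / denom if denom > 0 else 1.0


if __name__ == "__main__":
    truth = [0, 0, 1, 1, 2, 2]
    relabeled = [1, 1, 0, 0, 2, 2]   # same partition, different cluster ids
    one_error = [0, 0, 0, 1, 2, 2]   # one sample placed in the wrong cluster
    print(clustering_accuracy(truth, relabeled))
    print(clustering_accuracy(truth, one_error))
    print(normalized_mutual_information(truth, one_error))
```

Both metrics are invariant to how cluster ids are numbered, which is why a clustering that merely permutes the labels still scores as a perfect match.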