MGTCOM: Community Detection in Multimodal Graphs

E. Dmitriev, M. W. Chekol, S. Wang
DOI: https://doi.org/10.48550/arXiv.2211.06331
2022-11-11
Abstract: Community detection is the task of discovering groups of nodes sharing similar patterns within a network. With recent advancements in deep learning, methods utilizing graph representation learning and deep clustering have shown great results in community detection. However, these methods often rely on the topology of networks (i) ignoring important features such as network heterogeneity, temporality, multimodality, and other possibly relevant features. Besides, (ii) the number of communities is not known a priori and is often left to model selection. In addition, (iii) in multimodal networks all nodes are assumed to be symmetrical in their features; while true for homogeneous networks, most of the real-world networks are heterogeneous where feature availability often varies. In this paper, we propose a novel framework (named MGTCOM) that overcomes the above challenges (i)--(iii). MGTCOM identifies communities through multimodal feature learning by leveraging a new sampling technique for unsupervised learning of temporal embeddings. Importantly, MGTCOM is an end-to-end framework optimizing network embeddings, communities, and the number of communities in tandem. In order to assess its performance, we carried out an extensive evaluation on a number of multimodal networks. We found out that our method is competitive against state-of-the-art and performs well in inductive inference.
Social and Information Networks, Machine Learning
What problem does this paper attempt to address?
This paper addresses three problems in community detection on multimodal networks:

1. **Ignoring network heterogeneity, temporality, and multimodality**: Existing methods often rely only on the topology of the network and ignore its heterogeneity (different types of nodes and edges), temporality (changes in the network over time), and multimodality (nodes may carry several kinds of features, such as text and images).
2. **Unknown number of communities**: The number of communities is usually not known in advance and is often left to model selection, so the model should be able to infer an appropriate number itself.
3. **Asymmetry of node features**: In multimodal networks, feature availability varies across nodes: some nodes have rich features, while others have few features or none at all. Existing methods usually assume that all nodes have symmetrical features, which holds for homogeneous networks but not for most real-world heterogeneous ones.

To overcome these challenges, the paper proposes a new framework, **MGTCOM**, which identifies communities through multimodal feature learning and unsupervised learning of temporal embeddings. Its main contributions are:

- **A robust unsupervised inductive representation learning method for multimodal networks**.
- **A new sampling technique (called "ballroom walk") for unsupervised learning of temporal embeddings**.
- **An end-to-end framework that jointly optimizes network embeddings, communities, and the number of communities**.
- **A comprehensive evaluation of various feature qualities in multimodal networks**.
- **A comparison with existing state-of-the-art methods, demonstrating the robustness of MGTCOM in inference tasks**.
According to the abstract, the evaluation on a number of multimodal networks shows that MGTCOM is competitive with state-of-the-art methods and performs well in inductive inference.
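To make the idea of sampling for temporal embeddings more concrete, here is a minimal sketch of a generic time-window random walk. It is not the paper's "ballroom walk", whose details are not given in this summary. The toy edge list, the function names (`build_adjacency`, `temporal_walk`, `context_pairs`), and the `window` parameter are all assumptions made for illustration. The sketch only shows how restricting a walk to edges close in time can produce (center, context) pairs that an unsupervised embedding objective could consume.

```python
import random
from collections import defaultdict

# Hypothetical toy graph: (source, target, timestamp) triples.
EDGES = [
    ("a", "b", 1), ("b", "c", 2), ("c", "d", 3),
    ("a", "d", 9), ("d", "e", 10), ("b", "e", 11),
]


def build_adjacency(edges):
    """Index undirected, timestamped edges by endpoint."""
    adj = defaultdict(list)
    for u, v, t in edges:
        adj[u].append((v, t))
        adj[v].append((u, t))
    return adj


def temporal_walk(adj, start, start_time, window, length, rng):
    """Random walk that only follows edges with |t - start_time| <= window."""
    walk = [start]
    node = start
    for _ in range(length):
        candidates = [v for v, t in adj[node] if abs(t - start_time) <= window]
        if not candidates:
            break  # no edge close enough in time; stop early
        node = rng.choice(candidates)
        walk.append(node)
    return walk


def context_pairs(walk, context_size):
    """Skip-gram style (center, context) pairs taken from one walk."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - context_size)
        hi = min(len(walk), i + context_size + 1)
        for j in range(lo, hi):
            if i != j:
                pairs.append((center, walk[j]))
    return pairs


rng = random.Random(0)  # fixed seed so the sketch is reproducible
adj = build_adjacency(EDGES)
walk = temporal_walk(adj, start="a", start_time=2, window=2, length=4, rng=rng)
pairs = context_pairs(walk, context_size=1)
print(walk)
print(pairs)
```

With `start_time=2` and `window=2`, only the edges stamped 1 to 3 are eligible, so the walk stays among nodes `a` to `d` and never reaches `e`, which is connected only by later edges. Nodes that co-occur in such walks are close both structurally and in time, which is the kind of signal a temporal embedding would need.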