Evaluating Overfit and Underfit in Models of Network Community Structure

Amir Ghasemian, Homa Hosseinmardi, Aaron Clauset
DOI: https://doi.org/10.1109/TKDE.2019.2911585
2019-04-17
Abstract: A common data mining task on networks is community detection, which seeks an unsupervised decomposition of a network into structural groups based on statistical regularities in the network's connectivity. Although many methods exist, the No Free Lunch theorem for community detection implies that each makes some kind of tradeoff, and no algorithm can be optimal on all inputs. Thus, different algorithms will over or underfit on different inputs, finding more, fewer, or just different communities than is optimal, and evaluation methods that use a metadata partition as a ground truth will produce misleading conclusions about general accuracy. Here, we present a broad evaluation of over and underfitting in community detection, comparing the behavior of 16 state-of-the-art community detection algorithms on a novel and structurally diverse corpus of 406 real-world networks. We find that (i) algorithms vary widely both in the number of communities they find and in their corresponding composition, given the same input, (ii) algorithms can be clustered into distinct high-level groups based on similarities of their outputs on real-world networks, and (iii) these differences induce wide variation in accuracy on link prediction and link description tasks. We introduce a new diagnostic for evaluating overfitting and underfitting in practice, and use it to roughly divide community detection methods into general and specialized learning algorithms. Across methods and inputs, Bayesian techniques based on the stochastic block model and a minimum description length approach to regularization represent the best general learning approach, but can be outperformed under specific circumstances. These results introduce both a theoretically principled approach to evaluate over and underfitting in models of network community structure and a realistic benchmark by which new methods may be evaluated and compared.
Machine Learning,Social and Information Networks,Data Analysis, Statistics and Probability,Molecular Networks
What problem does this paper attempt to address?
This paper addresses the problem that community detection algorithms overfit and underfit when applied to real networks. Specifically:
1. **Differences in algorithm output**: Given the same input, different community detection algorithms find very different numbers of communities with very different compositions, which indicates that the algorithms differ substantially in how they adapt to the data.
2. **Lack of evaluation methods**: There is no systematic method for evaluating how much different algorithms overfit or underfit on real network data, which leaves practitioners with little guidance when choosing an algorithm.
3. **Gap between theory and practice**: Although consistency theorems exist for community detection, they apply mainly to dense networks, whereas most real-world networks are sparse. How well these theoretical results hold in practice therefore needs to be verified.
4. **The "No Free Lunch" theorem**: The "No Free Lunch" (NFL) theorem for community detection states that no algorithm can perform optimally on all inputs. An algorithm that performs better on some types of networks must therefore perform worse on others.
5. **The "No Ground Truth" theorem**: A related issue is the "No Ground Truth" theorem: there is no unique, correct community partition for a network. This limits traditional evaluation methods that treat node metadata labels as ground truth.
To address these problems, the authors conducted a broad study comparing 16 state-of-the-art community detection algorithms on a corpus of 406 real-world networks, and evaluated each algorithm's tendency to overfit or underfit through link prediction and link description tasks.
Through this approach, the authors aim to provide a more reliable way to evaluate and compare the performance of different community detection algorithms.
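To make the link prediction and link description tasks concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a fixed, hypothetical partition is already available (in practice it would come from a community detection algorithm run on the observed edges), scores each node pair by the observed edge density between the pair's communities, and summarizes accuracy with the AUC. The function names `block_densities`, `auc`, and `evaluate`, the 20% holdout fraction, and the density-based score are all illustrative assumptions.

```python
import itertools
import random
from collections import Counter


def block_densities(nodes, edges, partition):
    """Estimate the edge density for every pair of communities from observed edges."""
    sizes = Counter(partition[v] for v in nodes)
    edge_counts = Counter()
    for u, v in edges:
        edge_counts[tuple(sorted((partition[u], partition[v])))] += 1
    densities = {}
    for r, s in itertools.combinations_with_replacement(sorted(sizes), 2):
        # Number of possible node pairs within (r == s) or between (r != s) groups.
        pairs = sizes[r] * (sizes[r] - 1) / 2 if r == s else sizes[r] * sizes[s]
        densities[(r, s)] = edge_counts[(r, s)] / pairs if pairs else 0.0
    return densities


def auc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(positive_scores) * len(negative_scores))


def evaluate(nodes, edges, partition, holdout_fraction=0.2, seed=0):
    """Return (link prediction AUC, link description AUC) for a given partition."""
    rng = random.Random(seed)
    edges = [tuple(sorted(e)) for e in edges]
    shuffled = edges[:]
    rng.shuffle(shuffled)
    n_held = max(1, int(holdout_fraction * len(edges)))
    held_out, observed = shuffled[:n_held], shuffled[n_held:]

    # The model (here: block densities) only ever sees the observed edges.
    densities = block_densities(nodes, observed, partition)

    def score(pair):
        u, v = pair
        return densities[tuple(sorted((partition[u], partition[v])))]

    edge_set = set(edges)
    non_edges = [p for p in itertools.combinations(sorted(nodes), 2)
                 if p not in edge_set]
    negatives = [score(p) for p in non_edges]
    # Prediction: can held-out edges be told apart from true non-edges?
    prediction_auc = auc([score(e) for e in held_out], negatives)
    # Description: can the edges the model was fit on be told apart from non-edges?
    description_auc = auc([score(e) for e in observed], negatives)
    return prediction_auc, description_auc


# Toy input (an assumption for illustration): two 4-node cliques joined by one edge.
nodes = list(range(8))
edges = (list(itertools.combinations(range(4), 2))
         + list(itertools.combinations(range(4, 8), 2))
         + [(3, 4)])
partition = {v: 0 if v < 4 else 1 for v in nodes}
prediction_auc, description_auc = evaluate(nodes, edges, partition)
print(prediction_auc, description_auc)
```

Comparing the two numbers gives a rough sense of the diagnostic idea: a partition that describes the observed edges well but predicts held-out edges poorly suggests overfitting, while poor performance on both suggests underfitting. The paper's actual scoring functions and thresholds are not reproduced here.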