Abstract: A common data mining task on networks is community detection, which seeks an unsupervised decomposition of a network into structural groups based on statistical regularities in the network's connectivity. Although many methods exist, the No Free Lunch theorem for community detection implies that each makes some kind of tradeoff, and no algorithm can be optimal on all inputs. Thus, different algorithms will over or underfit on different inputs, finding more, fewer, or just different communities than is optimal, and evaluation methods that use a metadata partition as a ground truth will produce misleading conclusions about general accuracy. Here, we present a broad evaluation of over and underfitting in community detection, comparing the behavior of 16 state-of-the-art community detection algorithms on a novel and structurally diverse corpus of 406 real-world networks. We find that (i) algorithms vary widely both in the number of communities they find and in their corresponding composition, given the same input, (ii) algorithms can be clustered into distinct high-level groups based on similarities of their outputs on real-world networks, and (iii) these differences induce wide variation in accuracy on link prediction and link description tasks. We introduce a new diagnostic for evaluating overfitting and underfitting in practice, and use it to roughly divide community detection methods into general and specialized learning algorithms. Across methods and inputs, Bayesian techniques based on the stochastic block model and a minimum description length approach to regularization represent the best general learning approach, but can be outperformed under specific circumstances. These results introduce both a theoretically principled approach to evaluate over and underfitting in models of network community structure and a realistic benchmark by which new methods may be evaluated and compared.

What problem does this paper attempt to address?

The paper addresses overfitting and underfitting by community detection algorithms on real-world networks. Specifically:

1. **Differences in algorithm performance**: Given the same input network, different community detection algorithms find very different numbers of communities with very different compositions, which indicates that the algorithms differ significantly in how they fit the data.
2. **Lack of evaluation methods**: There is no systematic method for evaluating overfitting and underfitting of different algorithms on real network data, so practitioners have little guidance when choosing an algorithm.
3. **Gap between theory and practice**: Theoretical consistency results for community detection apply mainly to dense networks, whereas most real networks are sparse. Whether these results hold in practice therefore needs to be verified.
4. **The "No Free Lunch" theorem**: The "No Free Lunch" (NFL) theorem for community detection states that no algorithm can perform optimally on all inputs. Any algorithm that performs better on some types of networks must perform worse on others.
5. **The "No Ground Truth" theorem**: The "No Ground Truth" theorem states that there is no unique, correct community partition of a network. Evaluation methods that treat node labels (metadata) as ground truth are therefore limited.

To address these problems, the authors compare 16 state-of-the-art community detection algorithms on 406 real-world networks and evaluate each algorithm's tendency to overfit or underfit through link prediction and link description tasks. With this approach, the authors aim to provide a more reliable way to evaluate and compare community detection algorithms.
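To make the link prediction task above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' procedure: the partition is supplied by the caller (standing in for the output of any community detection algorithm run on the observed edges), pairs are scored by a simple block-density estimate, and the names `block_densities` and `link_prediction_auc` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def block_densities(nodes, observed_edges, partition):
    """Fraction of node pairs joined by an observed edge, per pair of communities."""
    edge_counts = Counter()
    for u, v in observed_edges:
        edge_counts[tuple(sorted((partition[u], partition[v])))] += 1
    pair_counts = Counter()
    for u, v in combinations(nodes, 2):
        pair_counts[tuple(sorted((partition[u], partition[v])))] += 1
    return {key: edge_counts[key] / n for key, n in pair_counts.items()}


def link_prediction_auc(nodes, observed_edges, heldout_edges, partition):
    """AUC for ranking held-out edges above true non-edges (ties count 0.5)."""
    densities = block_densities(nodes, observed_edges, partition)

    def score(u, v):
        return densities[tuple(sorted((partition[u], partition[v])))]

    observed = {frozenset(e) for e in observed_edges}
    heldout = {frozenset(e) for e in heldout_edges}
    positives = [score(*e) for e in heldout]
    negatives = [
        score(u, v)
        for u, v in combinations(nodes, 2)
        if frozenset((u, v)) not in observed and frozenset((u, v)) not in heldout
    ]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy network (made up for illustration): two 4-node cliques joined by one bridge.
nodes = list(range(8))
all_edges = (
    list(combinations(range(0, 4), 2))
    + list(combinations(range(4, 8), 2))
    + [(3, 4)]
)
heldout_edges = [(0, 1), (5, 6)]  # edges hidden from the model
observed_edges = [e for e in all_edges if e not in heldout_edges]

two_groups = {n: 0 if n < 4 else 1 for n in nodes}  # matches the planted cliques
one_group = {n: 0 for n in nodes}                   # too few communities
singletons = {n: n for n in nodes}                  # too many communities

auc_two_groups = link_prediction_auc(nodes, observed_edges, heldout_edges, two_groups)
auc_one_group = link_prediction_auc(nodes, observed_edges, heldout_edges, one_group)
auc_singletons = link_prediction_auc(nodes, observed_edges, heldout_edges, singletons)
print(auc_two_groups, auc_one_group, auc_singletons)
```

On this toy input, the partition that matches the two cliques ranks every held-out edge above every non-edge, while the single-community and all-singleton partitions carry no usable signal and score at chance. This mirrors, in a very simplified way, how too few or too many communities can both hurt link prediction.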

Evaluating Overfit and Underfit in Models of Network Community Structure

Community detection algorithm evaluation with ground-truth data

Community structure: A comparative evaluation of community detection methods

Comparative Evaluation of Community Detection Algorithms: A Topological Approach

Detecting Overlapping Community Structures In Networks With Global Partition And Local Expansion

The ground truth about metadata and community detection in networks

Sifting out communities in large sparse networks

Community Detection in Social Networks: An In-depth Benchmarking Study with a Procedure-Oriented Framework

Community Detection Algorithm Evaluation using Size and Hashtags

Implicit models, latent compression, intrinsic biases, and cheap lunches in community detection

CID Models on Real-world Social Networks and Goodness of Fit Measurements

Community detection in large‐scale networks: a survey and empirical evaluation

Defining and Evaluating Network Communities based on Ground-truth

Towards realistic artificial benchmark for community detection algorithms evaluation

LookCom: Learning Optimal Network for Community Detection

Beyond Asymptotics: Practical Insights into Community Detection in Complex Networks

Community Detection through Likelihood Optimization: In Search of a Sound Model

Discovering Natural Communities in Networks

A Survey on Theoretical Advances of Community Detection in Networks

Well-Connected Communities in Real-World and Synthetic Networks

Overview of Community Detection Models on Statistical Inference