Abstract: Understanding community structure has played an essential role in explaining network evolution, as nodes join communities which connect further to form large-scale complex networks. In real-world networks, nodes are often organized into communities based on ethnicity, gender, race, or wealth, leading to structural biases and inequalities. Community detection (CD) methods use network structure and nodes' attributes to identify communities, and can produce biased outcomes if they fail to account for structural inequalities, especially affecting minority groups. In this work, we propose group fairness metrics ($\Phi^{F*}_{p}$) to evaluate CD methods from a fairness perspective. We also conduct a comparative analysis of existing CD methods, focusing on the performance-fairness trade-off, to determine whether certain methods favor specific types of communities based on their size, density, or conductance. Our findings reveal that the trade-off varies significantly across methods, with no specific type of method consistently outperforming others. The proposed metrics and insights will help develop and evaluate fair and high performing CD methods.
What problem does this paper attempt to address?
This paper addresses the following problem: in social networks, existing community detection (CD) methods may produce unfair results for minority groups because they do not account for structural inequalities (such as communities formed along lines of race, gender, or wealth). Specifically, the authors pose and study the following questions:
1. **Evaluating the fairness of CD methods**: Are existing CD methods biased when identifying communities with different attributes (such as size, density, conductance)? In particular, are these methods more likely to identify large communities while ignoring small communities or minority groups?
2. **Trade-off between performance and fairness**: Are high-performing CD methods necessarily fair? Conversely, do fair CD methods sacrifice performance?
3. **Lack of fairness evaluation metrics**: Currently, there are no metrics specifically designed to evaluate the fairness of CD methods. Therefore, new measurement standards need to be developed to measure the fairness of CD methods in terms of different community attributes.
To address these problems, the authors propose a new family of fairness measures, the **group fairness metrics (\(\Phi\))**, and use them to conduct a comparative analysis of existing CD methods, exploring how each method trades off performance against fairness.
### Fairness Metric Formula
To calculate the group fairness metric \(\Phi\), the authors first define several community-level metrics, each computed for a pair of communities \(c_i\) and \(p_j\):
- **Fraction of Correctly Classified Nodes (FCCN)**:
\[
\text{FCCN}(c_i, p_j)=\frac{|c_i\cap p_j|}{|c_i|}
\]
- **F1-score**:
\[
\text{F1}(c_i, p_j)=\frac{2|c_i\cap p_j|}{|c_i|+|p_j|}
\]
- **Fraction of Correctly Classified Edges (FCCE)**:
\[
\text{FCCE}(c_i, p_j)=\frac{|E(c_i)\cap E(p_j)|}{|E(c_i)|}
\]
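As a rough illustration of the three community-level metrics, the Python sketch below assumes that \(c_i\) and \(p_j\) are sets of node IDs (a ground-truth community and the detected community matched to it) and that \(E(\cdot)\) is the set of edges with both endpoints inside the community. These readings of the symbols, the function names, and the toy graph are assumptions made for illustration and are not taken from the paper.

```python
def fccn(c, p):
    """Fraction of nodes of c that also appear in p: |c ∩ p| / |c|."""
    return len(c & p) / len(c)


def f1(c, p):
    """F1-score of the overlap: 2|c ∩ p| / (|c| + |p|)."""
    return 2 * len(c & p) / (len(c) + len(p))


def internal_edges(nodes, edges):
    """Undirected edges with both endpoints in `nodes` (assumed meaning of E(.))."""
    return {frozenset((u, v)) for u, v in edges if u in nodes and v in nodes}


def fcce(c, p, edges):
    """Fraction of internal edges of c that are also internal to p."""
    e_c = internal_edges(c, edges)
    e_p = internal_edges(p, edges)
    if not e_c:
        # Assumption: a community without internal edges scores 0.
        return 0.0
    return len(e_c & e_p) / len(e_c)


# Hypothetical toy data: one ground-truth community, one detected community.
c = {1, 2, 3, 4}
p = {3, 4, 5}
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 3)]

print(fccn(c, p))         # 2 shared nodes out of 4
print(f1(c, p))           # 2 * 2 / (4 + 3)
print(fcce(c, p, edges))  # 1 shared internal edge out of 4
```

In this toy case only nodes 3 and 4 are shared, so FCCN is 2/4, F1 is 4/7, and FCCE is 1/4 because edge (3, 4) is the only internal edge of \(c\) that is also internal to \(p\).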
Then, for each community-level metric, a least-squares line is fitted to the metric values as a function of a normalized community attribute (such as size, density, or conductance). The slope of that line is mapped to the fairness metric \(\Phi\) as follows:
\[
\Phi^{F*}_{p} = \frac{2}{\pi}\arctan\left(\frac{\Delta y}{\Delta x}\right)
\]
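where \(\Delta x\) and \(\Delta y\) are the changes along the horizontal and vertical axes of the regression line, respectively, so that \(\Delta y / \Delta x\) is its slope, and \(\Delta x = 1\) (due to normalization). Because \(\arctan\) takes values in \((-\pi/2, \pi/2)\), the factor \(2/\pi\) bounds \(\Phi\) to the interval \((-1, 1)\).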
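The slope-to-\(\Phi\) mapping can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the summary only says the attribute is "normalized", so min-max scaling to \([0, 1]\) is assumed here (it is consistent with \(\Delta x = 1\)), and the function names and sample values are hypothetical.

```python
import math


def min_max_normalize(values):
    """Scale values to [0, 1]; assumed form of the attribute normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("attribute is constant; slope is undefined")
    return [(v - lo) / (hi - lo) for v in values]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def phi(attribute_values, scores):
    """Map the regression slope to (-1, 1) via (2 / pi) * arctan(slope)."""
    xs = min_max_normalize(attribute_values)
    slope = least_squares_slope(xs, scores)
    return (2 / math.pi) * math.atan(slope)


# Hypothetical example: per-community sizes and per-community F1 scores.
sizes = [10, 20, 30, 40]
f1_scores = [0.2, 0.4, 0.5, 0.8]
print(phi(sizes, f1_scores))
```

Under this mapping, a value near 0 means the community-level score does not change linearly with the attribute, a positive value means communities with a larger attribute value (for example, larger size) are recovered better, and a negative value means the opposite.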
### Conclusions
Through experiments on LFR benchmark networks and real-world networks, the authors find that:
- The performance-fairness trade-off varies significantly across CD methods.
- No specific type of method consistently outperforms the others, and no method ensures both high performance and high fairness in all cases.
- The proposed group fairness metric \(\Phi\) can be used to evaluate the fairness of CD methods and provides a reference for designing fairer CD algorithms.
This research addresses a gap in fairness evaluation for community detection and supports the development of fairer social network analysis methods.