Fairness-Regulated Dense Subgraph Discovery

Emmanouil Kariotakis, Nikolaos Sidiropoulos, Aritra Konar
2024-12-04
Abstract: Dense subgraph discovery (DSD) is a key graph mining primitive with myriad applications including finding densely connected communities which are diverse in their vertex composition. In such a context, it is desirable to extract a dense subgraph that provides fair representation of the diverse subgroups that constitute the vertex set while incurring a small loss in terms of subgraph density. Existing methods for promoting fairness in DSD have important limitations -- the associated formulations are NP-hard in the worst case and they do not provide flexible notions of fairness, making it non-trivial to analyze the inherent trade-off between density and fairness. In this paper, we introduce two tractable formulations for fair DSD, each offering a different notion of fairness. Our methods provide a structured and flexible approach to incorporate fairness, accommodating varying fairness levels. We introduce the fairness-induced relative loss in subgraph density as a price of fairness measure to quantify the associated trade-off. We are the first to study such a notion in the context of detecting fair dense subgraphs. Extensive experiments on real-world datasets demonstrate that our methods not only match but frequently outperform existing solutions, sometimes incurring even less than half the subgraph density loss compared to prior art, while achieving the target fairness levels. Importantly, they excel in scenarios that previous methods fail to adequately handle, i.e., those with extreme subgroup imbalances, highlighting their effectiveness in extracting fair and dense solutions.
Social and Information Networks
### What problem does this paper attempt to solve?

This paper addresses how to ensure fairness in dense subgraph discovery (DSD), a core task in graph mining. Specifically, the authors study how to extract a densely connected subgraph that also gives fair representation to the different subgroups of vertices (such as protected and non-protected groups).

#### Background and Problem Description

1. **Dense Subgraph Discovery (DSD)**:
   - DSD is a key task in graph mining that aims to extract subgraphs with tight internal connections from a given graph.
   - Such subgraphs matter in many applications, including social network analysis, pattern detection in gene annotation graphs, and fraud detection in e-commerce and financial networks.
2. **Fairness Challenges**:
   - In practice, the vertices of a graph may carry sensitive attributes (such as gender, race, religion, or political inclination) that divide the vertex set into subgroups.
   - Existing DSD methods usually return highly homogeneous subgraphs: the vertex attributes in the subgraph lack diversity, so some subgroups may be ignored while others are over-represented.
   - A method is therefore needed that ensures fair representation of each subgroup while sacrificing as little subgraph density as possible.

#### Limitations of Existing Methods

- **NP-hardness**: The formulations behind existing fairness-promoting DSD methods are NP-hard in the worst case, so optimal solutions are difficult to find on large-scale datasets.
- **Lack of Flexibility**: The fairness notions offered by existing methods are not flexible, which makes it hard to analyze the trade-off between density and fairness.
- **Hard Constraints**: Some methods enforce fairness through hard constraints, which makes the problem more complex and more difficult to solve.

#### Main Contributions of the Paper

1. **Two New Tractable Formulations**:
   - The authors propose two new formulations (FADSG-I and FADSG-II) for introducing fairness into DSD, both solvable in polynomial time.
   - These formulations can generate dense subgraphs at different fairness levels, so users can choose the target fairness level flexibly.
2. **Quantifying the Price of Fairness**:
   - The "Price of Fairness" is introduced to quantify the relative loss in subgraph density incurred by meeting the fairness requirements.
   - This measure allows the trade-off between density and fairness to be analyzed systematically.
3. **Experimental Verification**:
   - Extensive experiments on multiple real-world datasets show that the new methods match and frequently outperform existing methods, sometimes incurring less than half of their subgraph density loss while achieving the target fairness levels.
   - The new methods perform especially well on extremely imbalanced subgroups, a setting that previous methods fail to handle adequately, and still extract fair and dense subgraphs.

### Summary

This paper addresses the problem of ensuring fairness in dense subgraph discovery by proposing new formulations that can be solved in polynomial time, and it provides a measure for quantifying the price of fairness, giving graph mining a more flexible and effective approach.
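
The price-of-fairness measure described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: density is the common average-degree form |E(S)| / |S|, the fairness level is the fraction of protected vertices in the subgraph, and the toy graph, both vertex subsets, and all function names are hypothetical. It does not implement FADSG-I or FADSG-II; it only evaluates the trade-off for two hand-picked candidate subgraphs.

```python
def density(edges, subset):
    """Assumed density: number of induced edges divided by number of vertices."""
    s = set(subset)
    if not s:
        return 0.0
    induced = sum(1 for u, v in edges if u in s and v in s)
    return induced / len(s)


def protected_fraction(subset, protected):
    """Illustrative fairness level: share of the subset that is protected."""
    s = set(subset)
    return len(s & set(protected)) / len(s) if s else 0.0


def price_of_fairness(edges, unconstrained, fair):
    """Relative density loss of the fair subgraph versus the unconstrained one."""
    base = density(edges, unconstrained)
    if base == 0:
        return 0.0
    return (base - density(edges, fair)) / base


# Hypothetical toy graph: a 4-clique on vertices 0..3 plus a path 3-4-5.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
protected = {4, 5}          # hypothetical protected subgroup

dense = {0, 1, 2, 3}        # hand-picked dense candidate (no protected vertices)
fair = {2, 3, 4, 5}         # hand-picked balanced candidate

print(density(edges, dense), protected_fraction(dense, protected))  # 1.5 0.0
print(density(edges, fair), protected_fraction(fair, protected))    # 0.75 0.5
print(price_of_fairness(edges, dense, fair))                        # 0.5
```

In this toy case, raising the protected share from 0 to one half halves the density, so the price of fairness is 0.5. Evaluating the same quantity over a range of target fairness levels is one way to trace the density-fairness trade-off that the summary refers to.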