Abstract: Dense subgraph discovery (DSD) is a key graph mining primitive with myriad applications including finding densely connected communities which are diverse in their vertex composition. In such a context, it is desirable to extract a dense subgraph that provides fair representation of the diverse subgroups that constitute the vertex set while incurring a small loss in terms of subgraph density. Existing methods for promoting fairness in DSD have important limitations -- the associated formulations are NP-hard in the worst case and they do not provide flexible notions of fairness, making it non-trivial to analyze the inherent trade-off between density and fairness. In this paper, we introduce two tractable formulations for fair DSD, each offering a different notion of fairness. Our methods provide a structured and flexible approach to incorporate fairness, accommodating varying fairness levels. We introduce the fairness-induced relative loss in subgraph density as a price of fairness measure to quantify the associated trade-off. We are the first to study such a notion in the context of detecting fair dense subgraphs. Extensive experiments on real-world datasets demonstrate that our methods not only match but frequently outperform existing solutions, sometimes incurring even less than half the subgraph density loss compared to prior art, while achieving the target fairness levels. Importantly, they excel in scenarios that previous methods fail to adequately handle, i.e., those with extreme subgroup imbalances, highlighting their effectiveness in extracting fair and dense solutions.


### What problem does this paper attempt to solve?

This paper addresses how to ensure fairness in dense subgraph discovery (DSD), a core task in graph mining. Specifically, the authors focus on extracting a subgraph that remains densely connected while fairly representing the different subgroups of vertices (such as protected and non-protected groups).

#### Background and Problem Description

1. **Dense Subgraph Discovery (DSD)**:
   - DSD is a key task in graph mining that aims to extract densely connected subgraphs from a given graph.
   - Such subgraphs matter in many applications, including social network analysis, pattern detection in gene annotation graphs, and fraud detection in e-commerce and financial networks.
2. **Fairness Challenges**:
   - In practice, the vertices of a graph may carry sensitive attributes (such as gender, race, religion, or political inclination) that partition the vertex set into subgroups.
   - Existing DSD methods often return highly homogeneous subgraphs whose vertex attributes lack diversity, so some subgroups may be ignored or over-represented.
   - A method is therefore needed that ensures fair representation of each subgroup while preserving subgraph density.

#### Limitations of Existing Methods

- **NP-hardness**: Existing formulations for promoting fairness in DSD are NP-hard in the worst case, which makes optimal solutions difficult to find on large-scale datasets.
- **Lack of Flexibility**: Existing methods do not offer flexible notions of fairness, which makes it hard to analyze the trade-off between density and fairness.
- **Hard Constraints**: Some methods enforce fairness through hard constraints, which makes the problem more complex and difficult to solve.

#### Main Contributions of the Paper

1. **Two New Tractable Formulations**:
   - The authors propose two formulations (FADSG-I and FADSG-II), solvable in polynomial time, for incorporating fairness into DSD.
   - These formulations can produce dense subgraphs at different fairness levels, letting users choose the target fairness level.
2. **Quantifying the Cost of Fairness**:
   - The "Price of Fairness" is introduced to quantify the relative loss in subgraph density incurred by meeting the fairness requirements.
   - This measure allows the trade-off between density and fairness to be analyzed systematically.
3. **Experimental Verification**:
   - Extensive experiments on multiple real-world datasets show that the new methods match and frequently outperform existing methods, sometimes incurring less than half the subgraph density loss of prior approaches.
   - The new methods perform especially well on extremely imbalanced subgroups, where they still extract fair and dense subgraphs.

### Summary

This paper tackles fairness in dense subgraph discovery by proposing new formulations that can be solved in polynomial time, and it provides a measure for quantifying the cost of fairness, offering a more flexible and effective approach for graph mining.
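
The abstract describes the price of fairness as the fairness-induced relative loss in subgraph density. The sketch below illustrates that idea on a toy graph. It is not the paper's algorithm: it assumes the common average-degree density |E(S)| / |S|, and the helper names (`density`, `price_of_fairness`, `protected_fraction`), the toy edge list, and the choice of "fair" subgraph are all hypothetical and picked by hand for illustration.

```python
def density(edges, subset):
    """Assumed density measure: induced edges divided by number of vertices."""
    s = set(subset)
    if not s:
        return 0.0
    induced = sum(1 for u, v in edges if u in s and v in s)
    return induced / len(s)


def price_of_fairness(edges, unconstrained, fair):
    """Relative density loss of the fair subgraph versus the unconstrained one."""
    base = density(edges, unconstrained)
    if base == 0:
        return 0.0
    return (base - density(edges, fair)) / base


def protected_fraction(subset, protected):
    """Share of the subgraph's vertices that belong to the protected group."""
    s = set(subset)
    return len(s & set(protected)) / len(s) if s else 0.0


# Toy graph: a 4-clique on vertices 0..3 plus two protected vertices (4 and 5),
# each attached to the clique by a single edge.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (4, 0), (5, 1)]
protected = {4, 5}

dense_only = {0, 1, 2, 3}        # hand-picked: the clique, no protected vertices
fair_pick = {0, 1, 2, 3, 4, 5}   # hand-picked: includes both protected vertices

print(density(edges, dense_only))                               # 1.5
print(round(density(edges, fair_pick), 4))                      # 1.3333
print(round(price_of_fairness(edges, dense_only, fair_pick), 4))  # 0.1111
print(round(protected_fraction(fair_pick, protected), 4))       # 0.3333
```

In this toy case, raising the protected share from 0 to one third costs about 11% of the density. Sweeping the target fairness level and recording this ratio is one way to trace the density-fairness trade-off that the summary refers to.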

Fairness-Regulated Dense Subgraph Discovery

In-depth Analysis of Densest Subgraph Discovery in a Unified Framework

Efficient Algorithms for Densest Subgraph Discovery

Mining Density Contrast Subgraphs

Densest Diverse Subgraphs: How to Plan a Successful Cocktail Party with Diversity

Finding a Dense Subgraph with Sparse Cut

In Search of Dense Subgraphs: How Good is Greedy Peeling?

Robust Densest Subgraph Discovery

Principal Fairness: Removing Bias via Projections

Efficient and effective algorithms for densest subgraph discovery and maintenance

A Semidefinite Relaxation Approach for Fair Graph Clustering

Online Dense Subgraph Discovery via Blurred-Graph Feedback

Explainable Subgraphs with Surprising Densities: A Subgroup Discovery Approach

Almost Tight Bounds for Differentially Private Densest Subgraph

Jaccard-constrained dense subgraph discovery

Fairness in Graph Mining: A Survey

Finding Densest $k$-Connected Subgraphs

Optimal Quasi-clique: Hardness, Equivalence with Densest-k-Subgraph, and Quasi-partitioned Community Mining

Local Density and its Distributed Approximation

Dense Subgraph Extraction with Application to Community Detection

On Densest $k$-Subgraph Mining and Diagonal Loading