A Unified Framework for Fair Spectral Clustering With Effective Graph Learning

Xiang Zhang, Qiao Wang
2023-11-23
Abstract: We consider the problem of spectral clustering under group fairness constraints, where samples from each sensitive group are approximately proportionally represented in each cluster. Traditional fair spectral clustering (FSC) methods consist of two consecutive stages, i.e., performing fair spectral embedding on a given graph and conducting $k$-means to obtain discrete cluster labels. However, in practice, the graph is usually unknown, and we need to construct the underlying graph from potentially noisy data, the quality of which inevitably affects subsequent fair clustering performance. Furthermore, performing FSC through separate steps breaks the connections among these steps, leading to suboptimal results. To this end, we first theoretically analyze the effect of the constructed graph on FSC. Motivated by the analysis, we propose a novel graph construction method with a node-adaptive graph filter to learn graphs from noisy data. Then, all independent stages of conventional FSC are integrated into a single objective function, forming an end-to-end framework that inputs raw data and outputs discrete cluster labels. An algorithm is developed to jointly and alternately update the variables in each stage. Finally, we conduct extensive experiments on synthetic, benchmark, and real data, which show that our model is superior to state-of-the-art fair clustering methods.
Machine Learning, Computers and Society
What problem does this paper attempt to address?
The paper addresses spectral clustering (SC) under group-fairness constraints, that is, clustering in which the samples of each sensitive group are approximately proportionally represented in each cluster. Traditional fair spectral clustering (FSC) methods run in two consecutive stages: first a fair spectral embedding is computed on a given graph, and then discrete cluster labels are obtained with k-means. In practice, however, the graph is usually unknown and must be constructed from potentially noisy data, and the quality of that graph inevitably affects the subsequent fair clustering performance. In addition, running FSC as separate steps breaks the connections among those steps and leads to suboptimal results. To address these problems, the paper makes the following contributions:
1. **Theoretical analysis**: The paper analyzes theoretically how the constructed graph affects FSC, which motivates the need for an accurate graph.
2. **Graph construction method**: Motivated by that analysis, a new graph construction method is proposed that uses a node-adaptive graph filter to learn graphs from noisy data.
3. **End-to-end framework**: The independent stages of conventional FSC are integrated into a single objective function, forming an end-to-end framework that takes raw data as input and outputs discrete cluster labels.
4. **Algorithm development**: An algorithm is developed that jointly and alternately updates the variables of each stage.
5. **Experimental verification**: Extensive experiments on synthetic, benchmark, and real data show that the model outperforms state-of-the-art fair clustering methods.
Through this work, the paper aims to provide a more effective and fairer spectral clustering method, especially when dealing with complex high-dimensional data sets.
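To make the conventional two-stage pipeline concrete, the following is a minimal Python sketch of the baseline that the paper sets out to improve, not the paper's end-to-end method. It rests on several assumptions that do not come from the source: a dense Gaussian-kernel graph stands in for graph learning (the paper's node-adaptive graph filter is not reproduced), group fairness is encoded with a commonly used linear constraint of the form F^T H = 0, and the function names `gaussian_graph`, `fair_spectral_embedding`, `kmeans`, and `group_proportions` are hypothetical.

```python
import numpy as np


def gaussian_graph(X, sigma=1.0):
    """Dense Gaussian-kernel adjacency matrix (assumed stand-in for a learned graph)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def fair_spectral_embedding(W, groups, k):
    """Stage 1: spectral embedding H restricted to the subspace where F^T H = 0.

    Assumes at least two sensitive groups and k <= n - (number of groups - 1).
    """
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    group_ids = np.unique(groups)
    # One column per group except the last: group indicator minus group proportion.
    F = np.stack(
        [(groups == g).astype(float) - np.mean(groups == g) for g in group_ids[:-1]],
        axis=1,
    )
    # Orthonormal basis Z of the null space of F^T, taken from the full SVD of F.
    U, s, _ = np.linalg.svd(F, full_matrices=True)
    rank = int(np.sum(s > 1e-10))
    Z = U[:, rank:]
    M = Z.T @ L @ Z
    M = (M + M.T) / 2.0  # symmetrize against round-off
    _, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    return Z @ vecs[:, :k]


def kmeans(H, k, n_iter=100, seed=0):
    """Stage 2: plain Lloyd iterations on the rows of the embedding."""
    rng = np.random.default_rng(seed)
    centers = H[rng.choice(len(H), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((H[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        new_centers = np.array(
            [H[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
             for j in range(k)]
        )
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign


def group_proportions(labels, groups, k):
    """Fraction of each sensitive group inside each cluster (rows: clusters)."""
    group_ids = np.unique(groups)
    P = np.zeros((k, len(group_ids)))
    for j in range(k):
        members = groups[labels == j]
        if members.size:
            P[j] = [np.mean(members == g) for g in group_ids]
    return P


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated blobs; sensitive group labels alternate within each blob.
    X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
    groups = np.array([0, 1] * 20)
    H = fair_spectral_embedding(gaussian_graph(X), groups, k=2)
    labels = kmeans(H, k=2)
    print(group_proportions(labels, groups, k=2))
```

In this sketch the graph, the embedding, and the discretization are computed one after another and never revisited, which is exactly the separation the paper argues against: an error in `gaussian_graph` propagates unchanged into both later stages. Comparing the rows of `group_proportions` with the overall group proportions in the data gives a rough check of how closely each cluster matches proportional representation.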