Black-Box Fairness Testing with Shadow Models

Weipeng Jiang, Chao Shen, Chenhao Lin, Jingyi Wang, Jun Sun, Xuanqi Gao
DOI: https://doi.org/10.1007/978-981-99-7356-9_28
2023-01-01
Abstract: Discrimination in decision-making systems is of growing concern as machine learning techniques (especially deep learning) are increasingly applied in systems with societal impact. Multiple recent works have proposed to identify or generate discriminative samples through fairness testing. State-of-the-art fairness testing methods can efficiently generate many discriminative samples, which can subsequently be used to improve the fairness of the model. Unfortunately, the applicability of these approaches is limited in practice because they require access to both the model and the training data, i.e., a white-box setting. In a black-box setting (e.g., testing online services), existing approaches are impractical for multiple reasons, e.g., they require huge testing budgets. In this work, we propose a black-box fairness testing approach for neural networks, namely BREAM, which addresses two challenges: how to generate many discriminative samples without issuing many queries, and how to guide the search without access to the original model. Our overall idea is to obtain approximate gradients by training shadow models, and to use these gradients to guide discriminative sample generation for black-box DNNs. We also observe that the density of discriminative samples varies across the input space, which enables incremental maintenance of the shadow models and rational allocation of search resources by dividing the search space into multiple subspaces. We evaluated BREAM on three widely adopted datasets for fairness research. The results show that BREAM achieves 9X higher performance than existing black-box methods, which is comparable to the state-of-the-art white-box fairness testing method.
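To make the core idea concrete, the following is a minimal sketch of shadow-model-guided search, not the BREAM implementation. Everything in it is an assumption made for illustration: the linear `black_box` stand-in, the logistic-regression shadow model, the three-feature input with a binary protected attribute at index 0, the step size, and the seed. A sample counts as discriminative here if flipping only the protected attribute changes the black-box label.

```python
import numpy as np

PROTECTED = 0  # index of a hypothetical binary protected attribute

def black_box(X):
    """Stand-in for a remote model that returns hard labels only (assumed)."""
    return (X @ np.array([1.5, 1.0, -1.0]) > 0.5).astype(int)

def is_discriminatory(predict, x):
    """True if flipping only the protected attribute changes the label."""
    flipped = x.copy()
    flipped[PROTECTED] = 1 - flipped[PROTECTED]
    return bool(predict(x[None])[0] != predict(flipped[None])[0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_shadow(X, y, lr=0.5, epochs=500):
    """Fit a logistic-regression shadow model to the queried labels."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        err = sigmoid(X @ w + b) - y
        w -= lr * X.T @ err / len(y)
        b -= lr * err.mean()
    return w, b

def guided_search(predict, shadow, seed, step=0.25, max_steps=20):
    """Walk toward the shadow model's decision boundary, testing each point."""
    w, b = shadow
    x = seed.astype(float).copy()
    for _ in range(max_steps):
        if is_discriminatory(predict, x):   # two black-box queries per check
            return x
        p = sigmoid(x @ w + b)
        grad = p * (1 - p) * w              # d p / d x for the logistic shadow model
        grad[PROTECTED] = 0.0               # never perturb the protected attribute
        # Move against the gradient if the shadow predicts 1, along it otherwise.
        x += step * np.sign(grad) * (-1.0 if p > 0.5 else 1.0)
    return None

rng = np.random.default_rng(0)
X = np.column_stack([rng.integers(0, 2, 200), rng.uniform(-2, 2, (200, 2))])
shadow = train_shadow(X, black_box(X))      # 200 queries to build the shadow model
found = guided_search(black_box, shadow, np.array([0.0, 2.0, 0.0]))
print(found)
```

The design point this illustrates is that the gradient comes entirely from the locally trained shadow model, so the black box is queried only to label the shadow model's training data and to confirm candidate samples.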