When Fairness Meets Privacy: Exploring Privacy Threats in Fair Binary Classifiers via Membership Inference Attacks

When Fairness Meets Privacy: Exploring Privacy Threats in Fair Binary Classifiers via Membership Inference Attacks

Huan Tian, Guangsheng Zhang, Bo Liu, Tianqing Zhu, Ming Ding, Wanlei Zhou

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 512-520. https://doi.org/10.24963/ijcai.2024/57

While in-processing fairness approaches show promise in mitigating bias predictions, their potential impact on privacy leakage remains under-explored. We aim to address this gap by assessing the privacy risks of fairness-enhanced binary classifiers with membership inference attacks (MIAs). Surprisingly, our results reveal that these fairness interventions exhibit increased resilience against existing attacks, indicating that enhancing fairness does not necessarily lead to privacy compromises. However, we find current attack methods are ineffective as they typically degrade into simple threshold models with limited attack effectiveness. Following this observation, we discover a novel threat dubbed Fairness Discrepancy Membership Inference Attacks (FD-MIA) that exploits prediction discrepancies between fair and biased models. This attack reveals more potent vulnerabilities and poses significant privacy risks to model privacy. Extensive experiments across multiple datasets, attack methods, and representative fairness approaches confirm our findings and demonstrate the efficacy of the proposed attack method. Our study exposes the overlooked privacy threats in fairness studies, advocating for thorough evaluations of potential security vulnerabilities before model deployments.
Keywords:
AI Ethics, Trust, Fairness: ETF: Trustworthy AI
AI Ethics, Trust, Fairness: ETF: Fairness and diversity
AI Ethics, Trust, Fairness: ETF: Safety and robustness