EAB-FL: Exacerbating Algorithmic Bias through Model Poisoning Attacks in Federated Learning

Syed Irfan Ali Meerza, Jian Liu

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 458-466. https://doi.org/10.24963/ijcai.2024/51

Federated Learning (FL) is a technique that allows multiple parties to collaboratively train a shared model without disclosing their private data, and it has become increasingly popular due to its distinct privacy advantages. However, FL models can suffer from biases against certain demographic groups (e.g., racial and gender groups) due to data heterogeneity and party selection. To address this issue, researchers have proposed various strategies for ensuring the group fairness of FL algorithms. However, the effectiveness of these strategies in the face of deliberate adversarial attacks has not been fully explored. Although existing studies have revealed various threats (e.g., model poisoning attacks) posed to FL systems by malicious participants, their primary aim is to decrease model accuracy; the potential of leveraging poisonous model updates to exacerbate model unfairness remains unexplored. In this paper, we propose EAB-FL, a new type of model poisoning attack that focuses on exacerbating group unfairness while maintaining a good level of model utility. Extensive experiments on three datasets demonstrate the effectiveness and efficiency of our attack, even when state-of-the-art fairness optimization algorithms and secure aggregation rules are employed. We hope this work will help the community fully understand the attack surfaces of current FL systems and facilitate the development of corresponding mitigations to improve their resilience.
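
To make the threat model concrete, below is a minimal, self-contained sketch of the general idea the abstract describes: a single malicious client in a FedAvg round steering a shared model against one demographic group while preserving utility for the other. This is not the paper's EAB-FL algorithm; the synthetic task, the label-flipping objective, and all names (e.g., make_data, local_update) are hypothetical illustrations, and the boosted update follows the generic model-replacement idea from prior poisoning work rather than anything specific to this paper.

    import numpy as np

    rng = np.random.default_rng(0)

    def make_data(n):
        # Synthetic task: true label = 1[x > 0]; g is a binary group attribute.
        # The x*g interaction feature lets a linear model act group-conditionally.
        x = rng.normal(size=n)
        g = rng.integers(0, 2, size=n).astype(float)
        X = np.stack([x, x * g, g], axis=1)
        y = (x > 0).astype(float)
        return X, y, g

    def local_update(w, X, y, lr=0.5, epochs=20):
        # Ordinary logistic-regression gradient descent on one client.
        w = w.copy()
        for _ in range(epochs):
            z = np.clip(X @ w, -30.0, 30.0)        # avoid overflow in exp
            p = 1.0 / (1.0 + np.exp(-z))
            w -= lr * X.T @ (p - y) / len(y)
        return w

    n_honest = 4
    honest = [make_data(300)[:2] for _ in range(n_honest)]

    # Malicious client: clean labels for group g=0, flipped labels for g=1,
    # so its local optimum inverts predictions only for the target group.
    Xm, ym, gm = make_data(300)
    ym = np.where(gm == 1.0, 1.0 - ym, ym)

    w_global = np.zeros(3)
    for _ in range(30):                            # FedAvg rounds
        models = [local_update(w_global, X, y) for X, y in honest]
        w_mal = local_update(w_global, Xm, ym)
        # Model-replacement-style boost so one client's update survives
        # averaging (real attacks are considerably stealthier than this).
        boost = n_honest + 1
        models.append(w_global + boost * (w_mal - w_global))
        w_global = np.mean(models, axis=0)         # unweighted FedAvg

    # Group-conditional accuracy on fresh data: the gap is the induced bias.
    Xt, yt, gt = make_data(5000)
    pred = (Xt @ w_global > 0).astype(float)
    for grp in (0.0, 1.0):
        m = gt == grp
        print(f"group {int(grp)} accuracy: {np.mean(pred[m] == yt[m]):.2f}")

Run as-is, this sketch typically leaves group 0 near-perfectly served while group 1's accuracy falls well below chance, illustrating how a poisoned update can widen a group-fairness gap without destroying overall utility for the unaffected group.
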
Keywords:
AI Ethics, Trust, Fairness: ETF: Fairness and diversity
AI Ethics, Trust, Fairness: ETF: Bias
Machine Learning: ML: Adversarial machine learning