Strategic Adversarial Attacks in AI-assisted Decision Making to Reduce Human Trust and Reliance
Zhuoran Lu, Zhuoyan Li, Chun-Wei Chiang, Ming Yin

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 3020-3028. https://doi.org/10.24963/ijcai.2023/337

With the increased integration of AI technologies in human decision making processes, adversarial attacks on AI models become a greater concern than ever before as they may significantly hurt humans’ trust in AI models and decrease the effectiveness of human-AI collaboration. While many adversarial attack methods have been proposed to decrease the performance of an AI model, limited attention has been paid to understanding how these attacks will impact the human decision makers interacting with the model, and accordingly, how to strategically deploy adversarial attacks to maximize the reduction of human trust and reliance. In this paper, through a human-subject experiment, we first show that in AI-assisted decision making, the timing of the attacks largely influences how much humans decrease their trust in and reliance on AI—the decrease is particularly salient when attacks occur on decision making tasks in which humans themselves are highly confident. Based on these insights, we next propose an algorithmic framework to infer the human decision maker’s hidden trust in the AI model and dynamically decide when the attacker should launch an attack on the model. Our evaluations show that by following the proposed approach, attackers deploy more efficient attacks and achieve higher utility than by adopting other baseline strategies.
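To make the abstract's framework concrete, the following is a minimal illustrative sketch, not the authors' actual algorithm: it maintains a Bayesian belief over the human's latent trust level, updated from whether the human followed the AI's recommendation, and uses a heuristic attack policy motivated by the paper's finding that attacks hurt trust most when the human is highly confident in their own judgment. All function names, thresholds, and the discretized trust grid are assumptions introduced here for illustration.

```python
import numpy as np

# Hypothetical sketch (not the paper's algorithm): a discretized belief over
# the human's latent trust level, updated with Bayes' rule from observed
# reliance behavior (whether the human followed the AI's recommendation).
TRUST_GRID = np.linspace(0.05, 0.95, 19)  # candidate latent trust levels

def update_belief(belief, followed_ai):
    """Bayesian update: model 'followed the AI' as a Bernoulli draw whose
    success probability equals the latent trust level."""
    likelihood = TRUST_GRID if followed_ai else (1.0 - TRUST_GRID)
    posterior = belief * likelihood
    return posterior / posterior.sum()

def should_attack(belief, human_confidence, budget_left,
                  trust_threshold=0.6, confidence_threshold=0.7):
    """Heuristic timing policy: attack only if estimated trust is still high
    (so there is trust left to damage), the human's own confidence on the
    current task is high, and the attack budget is not exhausted."""
    expected_trust = float(belief @ TRUST_GRID)
    return (budget_left > 0
            and expected_trust > trust_threshold
            and human_confidence > confidence_threshold)

# Toy usage: start from a uniform prior and step through a few rounds.
belief = np.ones_like(TRUST_GRID) / TRUST_GRID.size
for followed_ai, human_conf in [(True, 0.4), (True, 0.8), (False, 0.9)]:
    belief = update_belief(belief, followed_ai)
    print(should_attack(belief, human_conf, budget_left=3))
```

The design choice of attacking when the human's self-confidence is high mirrors the experimental insight reported in the abstract; the actual framework in the paper infers hidden trust and times attacks dynamically, and may differ substantially from this simplified sketch.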
Keywords:
Humans and AI: HAI: Human-AI collaboration
Humans and AI: HAI: Applications
Humans and AI: HAI: Human-computer interaction