SwiftThief: Enhancing Query Efficiency of Model Stealing by Contrastive Learning

Jeonghyun Lee, Sungmin Han, Sangkyun Lee

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 422-430. https://doi.org/10.24963/ijcai.2024/47

Model-stealing attacks are emerging as a severe threat to AI-based services: with only regular query-based access, an adversary can build a model that duplicates the functionality of the black-box AI model behind the service. To avoid detection and limit query costs, the model-stealing adversary must obtain an accurate clone model with as few queries as possible. To this end, we propose SwiftThief, a novel model-stealing framework that uses both queried and unqueried data to reduce query complexity. In particular, SwiftThief builds on contrastive learning, a recent technique for representation learning. We formulate a new objective function for model stealing that combines a self-supervised contrastive loss (for abundant unqueried inputs from public datasets) and a soft-supervised contrastive loss (for queried inputs), jointly optimized with an output matching loss (for queried inputs). In addition, we propose a new sampling strategy that prioritizes rarely queried classes to improve attack performance. Our experiments show that SwiftThief significantly improves the efficiency of model-stealing attacks over existing methods, achieving similar attack performance with only half the query budget of competing approaches. SwiftThief also remained highly effective when a defense was activated on the victim model.
Keywords:
AI Ethics, Trust, Fairness: ETF: Safety and robustness
Multidisciplinary Topics and Applications: MTA: Security and privacy
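
The three-part objective and the class-prioritizing sampling described in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not give the exact loss definitions, so everything below is an assumption made for illustration: an InfoNCE-style loss stands in for the self-supervised term, pairs of queried inputs are weighted by the agreement of the victim's soft labels for the soft-supervised term, cross-entropy against the victim's output stands in for output matching, and the names `w_soft`, `w_match`, and `pick_queries` are hypothetical.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def log_softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def self_supervised_loss(views_a, views_b, temperature=0.5):
    """Assumed InfoNCE-style loss on unqueried inputs: views_a[i] and
    views_b[i] embed two augmentations of the same input."""
    n = len(views_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(views_a[i], views_b[j]) / temperature for j in range(n)]
        total -= log_softmax(logits)[i]
    return total / n

def soft_supervised_loss(embeddings, victim_probs, temperature=0.5):
    """Assumed soft-supervised contrastive loss on queried inputs: a pair
    (i, j) counts as positive in proportion to the dot product of the
    victim's soft labels for i and j. Needs at least two inputs."""
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        logits = [cosine(embeddings[i], embeddings[j]) / temperature for j in others]
        log_p = log_softmax(logits)
        weights = [sum(a * b for a, b in zip(victim_probs[i], victim_probs[j]))
                   for j in others]
        norm = max(sum(weights), 1e-12)  # guard: no soft positives for input i
        total -= sum(w * lp for w, lp in zip(weights, log_p)) / norm
    return total / n

def output_matching_loss(clone_probs, victim_probs, eps=1e-12):
    """Cross-entropy between the victim's output and the clone's output."""
    total = 0.0
    for q, p in zip(clone_probs, victim_probs):
        total -= sum(pk * math.log(qk + eps) for pk, qk in zip(p, q))
    return total / len(victim_probs)

def joint_objective(l_self, l_soft, l_match, w_soft=1.0, w_match=1.0):
    # Hypothetical weighting; the abstract only says the terms are jointly optimized.
    return l_self + w_soft * l_soft + w_match * l_match

def pick_queries(pool_pred_classes, queried_class_counts, budget):
    """Assumed sampling rule: query the candidates whose clone-predicted
    class has been returned least often by the victim so far."""
    order = sorted(range(len(pool_pred_classes)),
                   key=lambda i: queried_class_counts.get(pool_pred_classes[i], 0))
    return order[:budget]
```

For example, `pick_queries([0, 1, 2, 1], {0: 10, 1: 3}, budget=2)` would select the candidate predicted as the never-queried class 2 first, followed by a candidate from the less-queried class 1. In a real attack the embeddings and probabilities would come from the clone and victim networks, and the losses would be computed with an automatic-differentiation framework rather than plain lists.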