Pluggable Watermarking of Deepfake Models for Deepfake Detection

Han Bao, Xuhong Zhang, Qinying Wang, Kangming Liang, Zonghui Wang, Shouling Ji, Wenzhi Chen

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 331-339. https://doi.org/10.24963/ijcai.2024/37

Deepfake model misuse poses major security concerns. Existing passive and active Deepfake detection methods both suffer from a lack of generalizability and robustness. In this study, we propose a pluggable and efficient active model watermarking framework for Deepfake detection. This approach facilitates the embedding of identification watermarks across a variety of Deepfake generation models, enabling easy extraction by authorities for detection purposes. Specifically, our method leverages the universal convolutional structure in generative model decoders. It employs convolutional kernel sparsification to adaptively position the embedded watermark and introduces convolutional kernel normalization to seamlessly integrate the watermark parameters with those of the generative model. For watermark extraction, we jointly train a watermark extractor based on a Deepfake detection model and use BCH encoding to identify watermarked images effectively. Finally, we apply our approach to eight major types of Deepfake generation models. Experiments show that our method successfully detects Deepfakes with an average accuracy exceeding 94%, even over heavily lossy channels. The approach operates independently of the generation model's training and does not affect the original model's performance. Furthermore, our method requires training only a very limited number of parameters, and it is resilient against three major adaptive attacks. The source code can be found at https://github.com/GuaiZao/Pluggable-Watermarking.
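
The sketch below is an illustration of the embedding idea described in the abstract, not the authors' released implementation. It assumes a generative decoder built from standard Conv2d layers and shows two steps in simplified form: selecting embedding positions via kernel sparsity (here, the lowest-L1-norm output kernels) and normalizing the injected watermark kernels to match the host layer's weight statistics. The function names, the 10% selection ratio, and the random watermark weights are assumptions chosen for illustration.

```python
# Illustrative sketch only (assumed names and parameters; see lead-in above).
import torch
import torch.nn as nn


def select_sparse_kernels(conv: nn.Conv2d, ratio: float = 0.1) -> torch.Tensor:
    """Return indices of the output kernels with the smallest L1 norm."""
    with torch.no_grad():
        # conv.weight has shape [out_channels, in_channels, kH, kW]
        l1 = conv.weight.abs().sum(dim=(1, 2, 3))
        k = max(1, int(ratio * conv.out_channels))
        return torch.topk(l1, k, largest=False).indices


def embed_watermark_kernels(conv: nn.Conv2d, wm_weight: torch.Tensor,
                            idx: torch.Tensor) -> None:
    """Overwrite the selected kernels with statistics-matched watermark kernels."""
    with torch.no_grad():
        host = conv.weight
        # Normalize the watermark kernels to the host layer's mean/std so the
        # modified layer stays statistically close to the original weights.
        wm = (wm_weight - wm_weight.mean()) / (wm_weight.std() + 1e-8)
        wm = wm * host.std() + host.mean()
        host[idx] = wm.to(host.dtype)


if __name__ == "__main__":
    # Hypothetical decoder layer standing in for a Deepfake generator's decoder.
    decoder_conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)
    idx = select_sparse_kernels(decoder_conv, ratio=0.1)
    # Hypothetical watermark parameters shaped like the selected kernels.
    wm_weight = torch.randn(len(idx), 64, 3, 3)
    embed_watermark_kernels(decoder_conv, wm_weight, idx)
    print("embedded watermark into kernel indices:", idx.tolist())
```

In the paper's full pipeline, the embedded watermark would additionally be recovered from generated images by a jointly trained extractor, with BCH encoding providing error correction; that extraction stage is not shown here.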
Keywords:
AI Ethics, Trust, Fairness: ETF: Trustworthy AI
AI Ethics, Trust, Fairness: ETF: Safety and robustness