Vision-fused Attack: Advancing Aggressive and Stealthy Adversarial Text against Neural Machine Translation

Yanni Xue, Haojie Hao, Jiakai Wang, Qiang Sheng, Renshuai Tao, Yu Liang, Pu Feng, Xianglong Liu

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 6606-6614. https://doi.org/10.24963/ijcai.2024/730

While neural machine translation (NMT) models achieve success in our daily lives, they show vulnerability to adversarial attacks. Despite being harmful, these attacks also offer benefits for interpreting and enhancing NMT models, thus drawing increased research attention. However, existing studies on adversarial attacks are insufficient in both attacking ability and human imperceptibility because they focus solely on the linguistic domain. This paper proposes a novel vision-fused attack (VFA) framework to acquire powerful adversarial text, i.e., text that is more aggressive and stealthy. Regarding attacking ability, we design the vision-merged solution space enhancement strategy to enlarge the limited semantic solution space, which enables us to search for adversarial candidates with higher attacking ability. For human imperceptibility, we propose the perception-retained adversarial text selection strategy to align with the human text-reading mechanism, so that the finally selected adversarial text is more deceptive. Extensive experiments on various models, including large language models (LLMs) such as LLaMA and GPT-3.5, strongly support that VFA outperforms the comparison methods by large margins (up to 81%/14% improvements in ASR/SSIM).
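As a rough illustration of the visual-stealthiness aspect mentioned above (not the paper's implementation), one can render an original and an adversarial sentence to images and score their similarity with SSIM, the metric reported in the abstract. The font path, canvas size, and the homoglyph substitution in the usage example below are illustrative assumptions.

```python
# Minimal sketch: score visual stealthiness of an adversarial sentence by
# rendering both sentences and comparing the images with SSIM.
from PIL import Image, ImageDraw, ImageFont
import numpy as np
from skimage.metrics import structural_similarity as ssim

def render(text, font_path="DejaVuSans.ttf", size=(480, 32)):
    """Render a sentence onto a fixed-size grayscale canvas."""
    img = Image.new("L", size, color=255)          # white background
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype(font_path, 20)       # assumed font file
    draw.text((4, 4), text, fill=0, font=font)     # black text
    return np.asarray(img, dtype=np.float64)

def visual_similarity(original, adversarial):
    """SSIM between rendered sentences (1.0 = visually identical)."""
    return ssim(render(original), render(adversarial), data_range=255.0)

# Usage: a hypothetical homoglyph perturbation ('o' -> Cyrillic 'о')
print(visual_similarity("good morning", "gооd morning"))
```

A high SSIM here would indicate that the perturbed sentence looks nearly identical to a human reader, which is the kind of imperceptibility the abstract quantifies.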
Keywords:
Natural Language Processing: NLP: Machine translation and multilinguality
AI Ethics, Trust, Fairness: ETF: Safety and robustness
Machine Learning: ML: Trustworthy machine learning