RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on
Chao Lin, Zhao Li, Sheng Zhou, Shichang Hu, Jialun Zhang, Linhao Luo, Jiarun Zhang, Longtao Huang, Yuan He

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Main Track. Pages 1151-1158. https://doi.org/10.24963/ijcai.2022/161

Virtual try-on (VTON) aims to fit target clothes onto reference person images and is widely adopted in e-commerce. Existing VTON approaches can be broadly categorized into Parser-Based (PB) and Parser-Free (PF), according to whether they rely on parser information to mask the person's clothes and synthesize try-on images. Although abandoning parser information has improved the applicability of PF methods, the ability to synthesize details has been sacrificed. As a result, residue from the original clothes may persist in the synthesized images, especially with complicated postures and in high-resolution applications. To address this issue, we propose a novel PF method named Regional Mask Guided Network (RMGN). Specifically, a regional mask is proposed to explicitly fuse the features of the target clothes and the reference person, so that the persisting residue can be eliminated. A posture-awareness loss and a multi-level feature extractor are further proposed to handle complicated postures and synthesize high-resolution images. Extensive experiments demonstrate that our proposed RMGN outperforms both state-of-the-art PB and PF methods. Ablation studies further verify the effectiveness of the modules in RMGN. Code is available at https://github.com/jokerlc/RMGN-VITON.
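The core idea of the regional mask can be illustrated with a minimal sketch. The function name, shapes, and blending rule below are illustrative assumptions, not the paper's actual implementation: a predicted mask in [0, 1] selects, per spatial location, between target-clothes features and reference-person features, so that locations outside the mask keep no contribution from the original clothes region.

```python
import numpy as np

def regional_mask_fusion(clothes_feat, person_feat, mask):
    """Blend two (C, H, W) feature maps with a (1, H, W) regional mask.

    Hypothetical sketch: mask == 1 keeps clothes features, mask == 0 keeps
    person features; intermediate values blend the two linearly.
    """
    return mask * clothes_feat + (1.0 - mask) * person_feat

rng = np.random.default_rng(0)
c = rng.standard_normal((8, 4, 4))           # warped target-clothes features
p = rng.standard_normal((8, 4, 4))           # reference-person features
m = np.zeros((1, 4, 4))
m[:, :, :2] = 1.0                            # left half marked as clothes region

fused = regional_mask_fusion(c, p, m)
assert np.allclose(fused[:, :, :2], c[:, :, :2])  # clothes region preserved
assert np.allclose(fused[:, :, 2:], p[:, :, 2:])  # person region preserved
```

In the full network the mask itself would be predicted by the model rather than given, but the hard per-region selection shown here is what lets the method suppress leftover features of the original clothes explicitly instead of relying on the generator to learn it implicitly.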
Keywords:
Computer Vision: Applications
Computer Vision: Neural generative models, auto encoders, GANs
Computer Vision: Segmentation