3D Surface Super-resolution from Enhanced 2D Normal Images: A Multimodal-driven Variational AutoEncoder Approach

3D Surface Super-resolution from Enhanced 2D Normal Images: A Multimodal-driven Variational AutoEncoder Approach

Wuyuan Xie, Tengcong Huang, Miaohui Wang

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 1578-1586. https://doi.org/10.24963/ijcai.2023/175

3D surface super-resolution is an important technical tool in virtual reality, and it is also a research hotspot in computer vision. Due to the unstructured and irregular nature of 3D object data, it is usually difficult to obtain high-quality surface details and geometry textures via a low-cost hardware setup. In this paper, we establish a multimodal-driven variational autoencoder (mmVAE) framework to perform 3D surface enhancement based on 2D normal images. To fully leverage the multimodal learning, we investigate a multimodal Gaussian mixture model (mmGMM) to align and fuse the latent feature representations from different modalities, and further propose a cross-scale encoder-decoder structure to reconstruct high-resolution normal images. Experimental results on several benchmark datasets demonstrate that our method delivers promising surface geometry structures and details in comparison with competitive advances.
Keywords:
Computer Vision: CV: Applications
Computer Vision: CV: 3D computer vision