Abstract
Incomplete Multi-Modal Visual Data Grouping
Handong Zhao, Hongfu Liu, Yun Fu
Multi-modal visual data have become much easier to access as technology develops. Nevertheless, a problem lies hidden behind the emerging multi-modality techniques: what if the data from one or more modalities are missing? Motivated by this question, we propose an unsupervised method that handles incomplete multi-modal data by transforming the original, incomplete data into a new, complete representation in a latent space. Unlike existing efforts that simply project the data from each modality into a common subspace, we propose a novel graph Laplacian term with a good probabilistic interpretation to couple the incomplete multi-modal samples. In this way, a compact global structure over the entire heterogeneous data is preserved, leading to strong grouping discriminability. As a non-trivial contribution, we provide an optimization solution for the proposed model. In experiments, we extensively evaluate our method and its competitors on one synthetic dataset, two RGB-D video datasets, and two image datasets. The superior results validate the benefits of the proposed method, especially when the multi-modal data suffer from large incompleteness.