Semi-supervised Learning over Heterogeneous Information Networks by Ensemble of Meta-graph Guided Random Walks

He Jiang, Yangqiu Song, Chenguang Wang, Ming Zhang, Yizhou Sun

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 1944-1950. https://doi.org/10.24963/ijcai.2017/270

Heterogeneous information networks (HINs) are a general representation for many real-world applications. Unlike traditional homogeneous graphs, the nodes and edges of an HIN are typed. In many applications, these types must be taken into account to make an approach semantically meaningful. For applications in which annotation is expensive, one natural way forward is semi-supervised learning over HINs. In this paper, we present a semi-supervised learning algorithm constrained by the types of HINs. We first decompose the original HIN into several semantically meaningful sub-graphs based on meta-graphs composed of entity and relation types. We then perform random walks over the sub-graphs to propagate labels from labeled data to unlabeled data. After obtaining the labels propagated by the different random walks guided by meta-graphs, we use an ensemble algorithm to vote for the final labeling results. We test our algorithm on two publicly available datasets, 20-newsgroups and RCV1. Experimental results show that our algorithm outperforms traditional semi-supervised learning algorithms for HINs. One particular by-product of this work is that we show that the previous random walk approach guided by meta-paths can be non-stationary, which is the major reason we propose a meta-graph guided random walk for semi-supervised learning over HINs.
Keywords:
Machine Learning: Semi-Supervised Learning
Natural Language Processing: Text Classification
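
The pipeline outlined in the abstract (propagate labels by random walk on each sub-graph, then vote) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the meta-graph decomposition has already produced one adjacency matrix per sub-graph over the same set of nodes, it substitutes a generic clamped random-walk label propagation for the paper's walk, and it uses a plain majority vote as the ensemble. The names `adjacency_from_edges`, `propagate_labels`, and `ensemble_vote`, and the parameters `alpha` and `n_iter`, are hypothetical.

```python
import numpy as np

def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix from undirected edges."""
    adj = np.zeros((n, n))
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1.0
    return adj

def propagate_labels(adj, labels, n_classes, alpha=0.85, n_iter=50):
    """Generic random-walk label propagation on one sub-graph (an assumption,
    not the paper's exact walk). labels[i] is a class id, or -1 if unlabeled."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1, keepdims=True)
    # Row-normalise into a transition matrix; isolated nodes keep a zero row.
    P = np.divide(adj, deg, out=np.zeros_like(adj), where=deg > 0)
    Y = np.zeros((adj.shape[0], n_classes))
    labeled = labels >= 0
    Y[labeled, labels[labeled]] = 1.0
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * (P @ F) + (1 - alpha) * Y
        F[labeled] = Y[labeled]  # clamp the known labels
    return F

def ensemble_vote(subgraphs, labels, n_classes):
    """Majority vote over the per-sub-graph predictions (assumed ensemble rule)."""
    n = len(labels)
    votes = np.zeros((n, n_classes))
    for adj in subgraphs:
        scores = propagate_labels(adj, labels, n_classes)
        reached = scores.sum(axis=1) > 0  # nodes no label reached cast no vote
        votes[np.arange(n)[reached], scores[reached].argmax(axis=1)] += 1
    return votes.argmax(axis=1)

# Toy example: six nodes, two sub-graphs, one labeled node per class.
labels = np.array([0, -1, -1, 1, -1, -1])
g1 = adjacency_from_edges(6, [(0, 1), (1, 2), (3, 4), (4, 5)])
g2 = adjacency_from_edges(6, [(0, 1), (0, 2), (3, 4), (3, 5)])
pred = ensemble_vote([g1, g2], labels, n_classes=2)
print(pred)  # [0 0 0 1 1 1]
```

In this toy setting both sub-graphs keep the two groups of nodes disconnected, so every walk agrees and the vote is unanimous. With real meta-graph sub-graphs the individual walks can disagree, which is where the ensemble step matters.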