Semi-supervised Max-margin Topic Model with Manifold Posterior Regularization

Wenbo Hu, Jun Zhu, Hang Su, Jingwei Zhuo, Bo Zhang

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 1865-1871. https://doi.org/10.24963/ijcai.2017/259

Supervised topic models leverage label information to learn discriminative latent topic representations. Because collecting a fully labeled dataset is often time-consuming, semi-supervised learning is of high interest. In this paper, we present LapMedLDA, an effective semi-supervised max-margin topic model that naturally introduces manifold posterior regularization into a regularized Bayesian topic model. The model jointly learns latent topics and a related classifier from only a small fraction of labeled documents. To perform approximate inference, we derive an efficient stochastic gradient MCMC method. Unlike previous semi-supervised topic models, our model tightly couples the generative topic model with the discriminative classifier. Extensive experiments demonstrate that this tight coupling brings significant benefits in both quantitative and qualitative performance.
Keywords:
Machine Learning: Semi-Supervised Learning
Natural Language Processing: Natural Language Processing
Natural Language Processing: Text Classification