Positive unlabeled learning via wrapper-based adaptive sampling

Pengyi Yang, Wei Liu, Jean Yang

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 3273-3279. https://doi.org/10.24963/ijcai.2017/457

Learning from positive and unlabeled data arises frequently in applications where only a subset of the positive instances is labeled and the rest of the data are unlabeled. In such scenarios, the goal is often to build a discriminant model that can accurately classify both positive and negative data by learning from the labeled and unlabeled instances. In this study, we propose an adaptive sampling (AdaSampling) approach that utilises prediction probabilities from a model to iteratively update the training data. Starting with equal prior probabilities for all unlabeled data, our method "wraps" around a predictive model and iteratively updates these probabilities to distinguish positive from negative instances in the unlabeled data. Subsequently, one or more robust negative sets can be drawn from the unlabeled data, according to the likelihood of each instance being negative, to train a single classification model or an ensemble of models.
Keywords:
Machine Learning: Classification
Machine Learning: Ensemble Methods
Machine Learning: Semi-Supervised Learning
Machine Learning: Cost-Sensitive Learning
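
The wrapper loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the base learner (`fit_centroid_model`, a toy nearest-centroid scorer), the function names, the fixed iteration count, sampling with replacement, and the toy data are all assumptions made for illustration. Only the overall shape follows the abstract: start with equal probabilities for all unlabeled instances, repeatedly sample a negative set according to those probabilities, refit the model, update the probabilities, and finally draw several negative sets to train an ensemble.

```python
import math
import random

def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_centroid_model(pos, neg):
    """Toy base learner (an assumption, not from the paper).

    Returns a function mapping x to an estimated probability of being
    positive, via a softmax over negative squared centroid distances.
    """
    c_pos, c_neg = centroid(pos), centroid(neg)

    def predict_pos_prob(x):
        diff = sq_dist(x, c_pos) - sq_dist(x, c_neg)
        return 1.0 / (1.0 + math.exp(min(diff, 700.0)))  # clamp avoids overflow

    return predict_pos_prob

def ada_sampling(pos, unlabeled, n_iter=10, seed=0, fit=fit_centroid_model):
    """Iteratively estimate P(negative) for each unlabeled instance."""
    rng = random.Random(seed)
    neg_prob = [1.0] * len(unlabeled)  # equal priors for all unlabeled data
    for _ in range(n_iter):
        # Draw a negative set, weighted by the current P(negative).
        sampled = rng.choices(unlabeled, weights=neg_prob, k=len(unlabeled))
        model = fit(pos, sampled)
        # Update P(negative); the small floor keeps the weights valid.
        neg_prob = [max(1.0 - model(x), 1e-12) for x in unlabeled]
    return neg_prob

def train_ensemble(pos, unlabeled, neg_prob, n_models=5, seed=1,
                   fit=fit_centroid_model):
    """Train one model per sampled negative set and average their outputs."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        neg = rng.choices(unlabeled, weights=neg_prob, k=len(unlabeled))
        models.append(fit(pos, neg))
    return lambda x: sum(m(x) for m in models) / len(models)

# Hypothetical toy data: labeled positives near (2, 2); the unlabeled set
# mixes three hidden positives with five true negatives near (-2, -2).
pos = [[2.0, 2.0], [2.2, 1.8], [1.8, 2.1], [2.1, 2.2]]
hidden_pos = [[1.9, 2.0], [2.0, 1.9], [2.1, 2.1]]
true_neg = [[-2.0, -2.0], [-1.9, -2.1], [-2.1, -1.8], [-2.2, -2.0], [-1.8, -1.9]]
unlabeled = hidden_pos + true_neg

neg_prob = ada_sampling(pos, unlabeled)
clf = train_ensemble(pos, unlabeled, neg_prob)
```

Because the wrapper only needs a function that fits on positive and negative sets and returns a probability scorer, any probabilistic classifier could be passed through the `fit` parameter in place of the toy centroid model.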