Positive and Unlabeled Learning with Label Disambiguation
Chuang Zhang, Dexin Ren, Tongliang Liu, Jian Yang, Chen Gong
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 4250-4256.
https://doi.org/10.24963/ijcai.2019/590
Positive and Unlabeled (PU) learning aims to learn a binary classifier from only positive and unlabeled training data. State-of-the-art methods usually formulate PU learning as a cost-sensitive learning problem, in which every unlabeled example is simultaneously treated as both positive and negative with different class weights. However, the ground-truth label of an unlabeled example should be unique, so the existing models inadvertently introduce label noise, which may lead to a biased classifier and deteriorated performance. To solve this problem, this paper proposes a novel algorithm dubbed "Positive and Unlabeled learning with Label Disambiguation" (PULD). We first regard all the unlabeled examples in PU learning as ambiguously labeled as both positive and negative, and then employ a margin-based label disambiguation strategy, which enlarges the margin of the classifier response between the most likely label and the less likely one, to find the unique ground-truth label of each unlabeled example. Theoretically, we derive the generalization error bound of the proposed method by analyzing its Rademacher complexity. Experimentally, we conduct intensive experiments on both benchmark and real-world datasets, and the results clearly demonstrate the superiority of the proposed PULD over existing PU learning approaches.
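To make the disambiguation idea concrete, below is a minimal Python sketch of iterative margin-based label disambiguation on PU data. It is an illustration only, not the paper's actual formulation: it substitutes scikit-learn's LogisticRegression for the paper's margin-based model, and the function name disambiguate_pu, the self-training loop, and the parameters n_rounds and step are our assumptions. Each unlabeled example starts with an ambiguous soft label, and every round nudges that soft label toward the candidate label with the larger classifier response, widening the gap between the more likely and less likely label.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def disambiguate_pu(X_pos, X_unl, n_rounds=10, step=0.5):
    """Illustrative sketch (not the paper's algorithm): assign a unique
    label to each unlabeled example by repeatedly pushing its soft label
    toward the candidate with the larger classifier response."""
    # Soft labels in [-1, 1]; start slightly negative, i.e. treat
    # unlabeled data as tentatively negative (a common PU heuristic).
    y_soft = -0.1 * np.ones(len(X_unl))
    clf = LogisticRegression()
    for _ in range(n_rounds):
        # Train on positives plus the current hard guesses for the
        # unlabeled points.
        X = np.vstack([X_pos, X_unl])
        y = np.concatenate([np.ones(len(X_pos)),
                            np.where(y_soft >= 0, 1.0, -1.0)])
        if len(np.unique(y)) < 2:  # degenerate: every point looks positive
            break
        clf.fit(X, y)
        # decision_function f(x) > 0 favours the +1 candidate; enlarge
        # the margin between the more and less likely label per example.
        scores = clf.decision_function(X_unl)
        y_soft = np.clip(y_soft + step * np.tanh(scores), -1.0, 1.0)
    # Final disambiguated hard labels for the unlabeled examples.
    return np.where(y_soft >= 0, 1, -1)
```

In this toy version the sign of the soft label at convergence plays the role of the unique ground-truth label the paper recovers; the actual PULD method derives this from a margin criterion with a generalization bound, which the sketch does not reproduce.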
Keywords:
Machine Learning: Classification
Machine Learning: Semi-Supervised Learning