Unbiased Active Semi-supervised Binary Classification Models

JooChul Lee, Weidong Ma, Ziyang Wang

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 4389-4397. https://doi.org/10.24963/ijcai.2024/485

Active learning aims to maximize model performance with a relatively small amount of labeled data, but active selection introduces sampling bias. To correct this bias, the current literature applies corrective weights within a supervised learning approach. However, those methods use only the small set of actively sampled data, so estimation efficiency can be improved by also using the unsampled data. In this paper, we develop an actively improved augmented estimation equation (AI-AEE) based on corrective weights together with imputation models, which allows us to leverage unlabeled data. We derive the asymptotic distribution of the proposed estimator, defined as the solution to the AI-AEE, and propose an optimal sampling scheme that minimizes its asymptotic mean squared error. We then propose a general, practical algorithm for training prediction models in the active and semi-supervised learning framework. The superiority of our method is demonstrated on synthetic and real data examples.
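The abstract does not state the form of the AI-AEE, so the following is only a minimal sketch of the general idea it describes: combining corrective (inverse-probability) weights with an imputation model so that unlabeled points also contribute to a logistic estimating equation. The augmented form used here is a standard one from the missing-data literature and is an assumption, not the paper's equation. All names (`solve_augmented_ee`, `pilot`, `pi`, `m`) and the uncertainty-based sampling rule are hypothetical and are not the paper's optimal sampling scheme.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def solve_augmented_ee(X, y, labeled, pi, m, n_iter=50, tol=1e-10):
    """Solve sum_i x_i * (y_tilde_i - sigmoid(x_i' theta)) = 0 by Newton's method.

    y_tilde_i = m_i + (labeled_i / pi_i) * (y_i - m_i) is an assumed
    augmented pseudo-outcome: the imputation m_i is used for every point,
    and labeled points add a weighted correction.
    """
    w = labeled / pi                           # corrective weights, 0 if unlabeled
    y_seen = np.where(labeled == 1, y, 0.0)    # unlabeled outcomes are never used
    y_tilde = m + w * (y_seen - m)
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        score = X.T @ (y_tilde - p)
        info = (X * (p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(info, score)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta

# Synthetic pool (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
N = 2000
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
theta_true = np.array([-0.5, 1.0, -1.0])
y = (rng.random(N) < sigmoid(X @ theta_true)).astype(float)

# Hypothetical pilot model, used both for imputation and for sampling.
pilot = sigmoid(X @ np.array([0.0, 0.8, -0.8]))
m = pilot                                      # imputed P(y = 1 | x)
pi = 0.2 + 0.6 * 4.0 * pilot * (1.0 - pilot)   # sample uncertain points more often
labeled = (rng.random(N) < pi).astype(float)   # 1 if the label was queried

theta_hat = solve_augmented_ee(X, y, labeled, pi, m)

# Residual of the estimating equation at the solution (should be near zero).
w = labeled / pi
y_tilde = m + w * (np.where(labeled == 1, y, 0.0) - m)
residual = X.T @ (y_tilde - sigmoid(X @ theta_hat))
print("theta_hat:", theta_hat)
```

In this sketch the sampling probabilities `pi` are known by design, so the weighted correction has mean zero given the features even when the imputation `m` is poor; a better `m` would mainly be expected to reduce variance.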
Keywords:
Machine Learning: ML: Active learning
Machine Learning: ML: Regression
Machine Learning: ML: Semi-supervised learning