Early Active Learning via Robust Representation and Structured Sparsity
Feiping Nie, Hua Wang, Heng Huang, Chris Ding

Labeling training data is time-consuming but essential for supervised learning models. To address this problem, active learning has been studied and applied to select informative and representative data points for labeling. However, during the early stage of an experiment, few or no labeled data points exist, so the most representative samples should be selected first. In this paper, we propose a novel robust active learning method that handles the early-stage experimental design problem by selecting the most representative data points. Because selecting representative samples is an NP-hard problem, we employ a structured sparsity-inducing norm to relax the objective into an efficient convex formulation. Meanwhile, a robust sparse representation loss function is used to reduce the effect of outliers. We introduce a new efficient optimization algorithm that solves our non-smooth objective at low computational cost and with proven global convergence. Empirical results on both single-label and multi-label classification benchmark data sets demonstrate the promising performance of our method.
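The abstract does not state the objective itself, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes one common way to combine a robust representation loss with a structured sparsity penalty: each sample x_i (a column of X) is reconstructed as X a_i, the loss is the sum of unsquared residual norms ||x_i - X a_i||_2 (less sensitive to outliers than a squared loss), and the penalty is the sum of row norms of A, so that samples whose rows remain non-zero are the ones used to represent the rest. The solver is a generic iteratively reweighted least-squares scheme on an eps-smoothed version of that objective. The function name `select_representatives` and the parameters `gamma`, `n_iter`, and `eps` are hypothetical.

```python
import numpy as np

def select_representatives(X, gamma=1.0, n_iter=30, eps=1e-8):
    """Illustrative sketch (assumed formulation, not the paper's code).

    X is a (d, n) matrix with one sample per column. Approximately minimizes
        sum_i ||x_i - X a_i||_2 + gamma * sum_j ||A[j, :]||_2
    by iteratively reweighted least squares on an eps-smoothed objective.
    Returns (scores, A, history): scores[j] is the norm of row j of A,
    history holds the smoothed objective value after each iteration.
    """
    d, n = X.shape
    G = X.T @ X            # Gram matrix of the samples, shape (n, n)
    p = np.ones(n)         # reweighting of the per-sample residual terms
    q = np.ones(n)         # reweighting of the per-row sparsity terms
    A = np.zeros((n, n))
    history = []
    for _ in range(n_iter):
        for i in range(n):
            # Weighted ridge sub-problem for column i:
            # (p_i * G + gamma * diag(q)) a_i = p_i * X^T x_i
            M = p[i] * G + gamma * np.diag(q)
            A[:, i] = np.linalg.solve(M, p[i] * G[:, i])
        # Smoothed residual norms (per sample) and row norms (per candidate).
        resid = np.sqrt(np.sum((X - X @ A) ** 2, axis=0) + eps)
        rows = np.sqrt(np.sum(A ** 2, axis=1) + eps)
        history.append(resid.sum() + gamma * rows.sum())
        # Reweighting step: large residuals and large rows get small weights.
        p = 1.0 / (2.0 * resid)
        q = 1.0 / (2.0 * rows)
    scores = np.sqrt(np.sum(A ** 2, axis=1))
    return scores, A, history

# Usage on synthetic data (hypothetical sizes): rank samples by row norm
# and take the top k as candidates for labeling.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5, 12))
scores_demo, _, _ = select_representatives(X_demo, gamma=0.5, n_iter=15)
top_k = np.argsort(-scores_demo)[:3]
```

Solving one n-by-n system per sample keeps the sketch short but is only practical for small n; it says nothing about the computational cost of the method proposed in the paper.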