Cost-Effective Active Learning for Hierarchical Multi-Label Classification

Yi-Fan Yan, Sheng-Jun Huang

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 2962-2968. https://doi.org/10.24963/ijcai.2018/411

Active learning reduces labeling cost by actively querying labels for the most valuable data. It is particularly important for multi-label learning, where annotation is expensive because each instance may have multiple labels simultaneously. In many multi-label tasks, the labels are organized into hierarchies from coarse to fine. Labels at different levels of the hierarchy contribute differently to model training and also carry different annotation costs. In this paper, we propose a multi-label active learning approach that exploits the label hierarchy for cost-effective queries. By incorporating the potential contribution of ancestor and descendant labels, a novel criterion is proposed to estimate the informativeness of each candidate query. Further, a subset selection method is introduced to perform active batch selection by balancing the informativeness and cost of each instance-label pair. Experimental results validate the effectiveness of both the proposed criterion and the selection method.
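
The idea of balancing informativeness against cost for instance-label pairs can be illustrated with a minimal sketch. This is not the paper's criterion or its subset selection method, which the abstract does not specify; it substitutes a generic greedy heuristic that ranks candidates by informativeness per unit cost under a fixed budget. The names Query, select_batch, and budget, as well as all scores and costs, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """A candidate instance-label pair (hypothetical structure)."""
    instance: int
    label: str
    informativeness: float  # assumed to be supplied by some scoring criterion
    cost: float             # assumed annotation cost, must be positive


def select_batch(candidates, budget):
    """Greedy stand-in: take pairs in order of informativeness per unit cost,
    skipping any pair that would push the total cost over the budget."""
    for q in candidates:
        if q.cost <= 0:
            raise ValueError("annotation cost must be positive")
    ranked = sorted(candidates,
                    key=lambda q: q.informativeness / q.cost,
                    reverse=True)
    batch, spent = [], 0.0
    for q in ranked:
        if spent + q.cost <= budget:
            batch.append(q)
            spent += q.cost
    return batch


# Illustrative values only: coarse labels ("animal") are cheaper to annotate
# than fine ones ("dog", "cat") in this made-up example.
candidates = [
    Query(0, "animal", 0.9, 1.0),
    Query(0, "dog", 1.2, 3.0),
    Query(1, "animal", 0.5, 1.0),
    Query(1, "cat", 2.0, 2.0),
]
batch = select_batch(candidates, budget=4.0)
```

With these made-up numbers the fine-grained label "dog" is skipped: its score is the second highest, but its cost would exceed the remaining budget. A ratio-based greedy rule is only one simple way to trade off the two quantities.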
Keywords:
Machine Learning: Active Learning
Machine Learning: Machine Learning
Machine Learning: Semi-Supervised Learning
Machine Learning: Multi-instance;Multi-label;Multi-view learning