Let’s Start Over: Retraining with Selective Samples for Generalized Category Discovery
Zhimao Peng, Enguang Wang, Xialei Liu, Ming-Ming Cheng
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 4815-4823.
https://doi.org/10.24963/ijcai.2024/532
Generalized Category Discovery (GCD) presents a realistic and challenging problem in open-world learning. Given a partially labeled dataset, GCD aims to categorize unlabeled data by leveraging visual knowledge from the labeled data, where the unlabeled data includes both known and unknown classes. Existing methods based on parametric/non-parametric classifiers attempt to generate pseudo-labels/relationships for the unlabeled data to enhance representation learning. However, the lack of ground-truth labels for novel classes often leads to noisy pseudo-labels/relationships, resulting in suboptimal representation learning. This paper introduces a novel method using Nearest Neighbor Distance-aware Label Consistency sample selection. It creates class-consistent subsets for novel class sample clusters from the current GCD method, acting as “pseudo-labeled sets” to mitigate representation bias. We propose progressive supervised representation learning with selected samples to optimize the trade-off between quantity and purity in each subset. Our method is versatile and applicable to various GCD methods, whether parametric or non-parametric. We conducted extensive experiments on multiple generic and fine-grained image classification datasets to evaluate the effectiveness of our approach. The results demonstrate the superiority of our method in achieving improved performance in generalized category discovery tasks.
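
The abstract does not spell out the selection rule, so the following is only an illustrative sketch of the general idea of neighbor-based label-consistency filtering, not the authors' algorithm. It assumes that a sample is kept when a sufficient fraction of its k nearest neighbors in feature space share its cluster assignment. The function name `select_consistent_samples`, the parameters `k` and `min_agreement`, and the use of unweighted Euclidean neighbors (with no distance weighting) are all assumptions made for illustration.

```python
import numpy as np

def select_consistent_samples(features, cluster_labels, k=5, min_agreement=1.0):
    """Return indices of samples whose k nearest neighbors mostly share
    their cluster label (hypothetical rule, not the paper's exact method)."""
    features = np.asarray(features, dtype=float)
    cluster_labels = np.asarray(cluster_labels)

    # Pairwise squared Euclidean distances between all samples.
    sq_norms = np.sum(features ** 2, axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbor

    # Indices of the k nearest neighbors of each sample.
    nn_idx = np.argsort(dists, axis=1)[:, :k]

    # Fraction of neighbors whose cluster label matches the sample's own.
    agreement = (cluster_labels[nn_idx] == cluster_labels[:, None]).mean(axis=1)

    # Keep only samples whose neighborhood is sufficiently label-consistent.
    return np.flatnonzero(agreement >= min_agreement)
```

In this sketch, raising `min_agreement` yields purer but smaller subsets and lowering it yields larger but noisier ones, which is one simple way to picture the quantity/purity trade-off mentioned above.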
Keywords:
Machine Learning: ML: Clustering
Computer Vision: CV: Transfer, low-shot, semi- and un-supervised learning
Machine Learning: ML: Classification
Computer Vision: CV: Recognition (object detection, categorization)