A Survey on Dataset Distillation: Approaches, Applications and Future Directions

Jiahui Geng, Zongxiong Chen, Yuandou Wang, Herbert Woisetschlaeger, Sonja Schimmler, Ruben Mayer, Zhiming Zhao, Chunming Rong

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Survey Track. Pages 6610-6618. https://doi.org/10.24963/ijcai.2023/741

Dataset distillation is attracting more attention in machine learning as training sets continue to grow and the cost of training state-of-the-art models becomes increasingly high. By synthesizing datasets with high information density, dataset distillation offers a range of potential applications, including support for continual learning, neural architecture search, and privacy protection. Despite recent advances, we lack a holistic understanding of the approaches and applications. Our survey aims to bridge this gap by first proposing a taxonomy of dataset distillation and characterizing existing approaches, and then systematically reviewing the data modalities and related applications. In addition, we summarize the challenges and discuss future directions for this field of research.
Keywords:
Survey: Knowledge Representation and Reasoning
Survey: Computer Vision
Survey: Machine Learning
Survey: AI Ethics, Trust, Fairness