Approximate Algorithms for k-Sparse Wasserstein Barycenter with Outliers

Qingyuan Yang, Hu Ding

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 5316-5325. https://doi.org/10.24963/ijcai.2024/588

Wasserstein Barycenter (WB) is one of the most fundamental optimization problems in optimal transportation. Given a set of distributions, the goal of WB is to find a new distribution that minimizes the average Wasserstein distance to them. The problem becomes even harder if we restrict the solution to be “k-sparse”. In this paper, we study the k-sparse WB problem in the presence of outliers, which is a more practical setting since real-world data often contain noise. Existing WB algorithms cannot be directly extended to handle outliers, so new ideas are needed. First, we investigate the relation between k-sparse WB with outliers and clustering (with outliers) problems. In particular, we propose a clustering-based LP method that yields a constant approximation factor for the k-sparse WB with outliers problem. Further, we use the coreset technique to achieve a (1+ε)-approximation factor for any ε>0 when the dimensionality is not high. Finally, we conduct experiments on our proposed algorithms and illustrate their efficiency in practice.
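For orientation, the k-sparse WB objective described above can be written roughly as follows. This is a sketch in common optimal-transport notation, not necessarily the paper's own: the symbols \mu_1, ..., \mu_m (input distributions), \nu (the barycenter), the exponent p (often p = 2), and the coupling set \Pi are assumptions introduced here, and the outlier variant studied in the paper is not captured by this formula.

```latex
% Assumed notation: \mu_1,\dots,\mu_m are the input distributions,
% \nu is the candidate barycenter, restricted to at most k support points.
\min_{\nu \,:\, |\mathrm{supp}(\nu)| \le k} \;
  \frac{1}{m} \sum_{i=1}^{m} W_p^p(\nu, \mu_i),
\qquad
W_p^p(\nu, \mu) \;=\; \min_{\pi \in \Pi(\nu, \mu)}
  \int \|x - y\|^p \, \mathrm{d}\pi(x, y)
% \Pi(\nu,\mu): all couplings (joint distributions) with marginals \nu and \mu.
```

When m = 1 and p = 2, this objective reduces to a weighted k-means style problem, which gives some intuition for the link to clustering that the abstract mentions.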
Keywords:
Machine Learning: ML: Optimization
Data Mining: DM: Anomaly/outlier detection
Machine Learning: ML: Clustering