Abstract
Euler Clustering
Jian-Sheng Wu, Wei-Shi Zheng, Jian-Huang Lai
By mapping data from a lower-dimensional space into a higher- or even infinite-dimensional space, kernel k-means can organize data into groups even when the clusters are not linearly separable. However, kernel k-means incurs heavy computation as a consequence of the representer theorem: it must keep an extremely large kernel matrix in memory when using popular kernels such as the Gaussian and spatial pyramid matching kernels, which largely limits its use on large-scale data. Existing kernel clustering methods can also overfit to outliers. In this paper, we introduce Euler clustering, which not only retains the benefit of nonlinear modeling with a kernel function but also largely resolves the large-scale computational problem in kernel-based clustering. This is realized by incorporating the Euler kernel. The Euler kernel relies on a nonlinear and robust cosine metric that is less sensitive to outliers. More importantly, it intrinsically induces an empirical map that maps data onto a complex space of the same dimension. Euler clustering exploits these advantages to measure the similarity between data points robustly without increasing the dimensionality of the data, and thus avoids the large-scale problem of kernel k-means. We evaluate Euler clustering on five publicly available datasets and show its superiority over related methods.
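As an illustration of the idea summarized above, the following minimal sketch maps real vectors onto a complex space of the same dimension and runs ordinary Lloyd-style k-means there, so no kernel matrix is ever formed. It rests on assumptions not stated in the abstract: the map is taken to be the Euler representation z_j = exp(i·α·π·x_j)/√2 commonly used in the Euler kernel literature, features are assumed scaled to [0, 1], and the names `euler_map`, `euler_kernel`, `sq_dist`, `kmeans_complex`, and the parameter `alpha` are hypothetical. The clustering loop is plain k-means with mean centroids, not necessarily the optimization used in the paper.

```python
import cmath
import math


def euler_map(x, alpha=1.0):
    """Map a real vector to a complex vector of the same length.

    Assumed form: z_j = exp(i * alpha * pi * x_j) / sqrt(2),
    with features scaled to [0, 1].
    """
    return [cmath.exp(1j * alpha * math.pi * v) / math.sqrt(2) for v in x]


def euler_kernel(x, y, alpha=1.0):
    """Real part of the inner product of the mapped vectors.

    Equals 0.5 * sum_j cos(alpha * pi * (x_j - y_j)), a cosine-based similarity.
    """
    zx, zy = euler_map(x, alpha), euler_map(y, alpha)
    return sum(a * b.conjugate() for a, b in zip(zx, zy)).real


def sq_dist(zx, zy):
    """Squared Euclidean distance between two complex vectors."""
    return sum(abs(a - b) ** 2 for a, b in zip(zx, zy))


def kmeans_complex(points, init_centers, n_iter=10):
    """Plain Lloyd iterations on explicitly mapped (complex) points.

    Memory grows with n * d rather than n * n, since no kernel matrix is built.
    """
    centers = [list(c) for c in init_centers]  # copy so the inputs are not mutated
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest center in the complex space.
        labels = [
            min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy usage: two well-separated groups in [0, 1]^2.
data = [[0.0, 0.1], [0.05, 0.0], [0.9, 1.0], [1.0, 0.95]]
mapped = [euler_map(x) for x in data]
labels = kmeans_complex(mapped, [mapped[0], mapped[2]])
```

Under the assumed map, the squared distance between two mapped points reduces to Σ_j (1 − cos(απ(x_j − y_j))). Each per-feature term is therefore bounded by 2, which is one way to see why a single large outlying feature value has limited influence compared with the squared Euclidean distance.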