Learning General Gaussian Mixture Model with Integral Cosine Similarity

Guanglin Li; Bin Li; Changsheng Chen; Shunquan Tan; Guoping Qiu

doi:10.24963/ijcai.2022/444

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence

Main Track. Pages 3201-3207. https://doi.org/10.24963/ijcai.2022/444

The Gaussian mixture model (GMM) is a powerful statistical tool for data modeling, especially in unsupervised learning tasks. Traditional learning methods for GMM, such as expectation maximization (EM), require the covariance matrices of the Gaussian components to be non-singular, a condition that is often not satisfied in real-world applications. This paper presents a new learning method called G$^2$M$^2$ (General Gaussian Mixture Model), which fits an unnormalized Gaussian mixture function (UGMF) to a data distribution. At the core of G$^2$M$^2$ is an integral cosine similarity (ICS) function that compares the UGMF with the unknown data density distribution without having to estimate that density explicitly. By maximizing the ICS through Monte Carlo sampling, the UGMF can be made to overlap with the unknown data density distribution so that the two differ only by a constant scalar; the UGMF can then be normalized to obtain the data density distribution. A Siamese convolutional neural network is also designed for optimizing the ICS function. Experimental results show that our method is more competitive in modeling data whose correlations may lead to singular covariance matrices in GMM, and that it outperforms state-of-the-art methods in unsupervised anomaly detection.
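
The following is a minimal sketch of how an ICS-style objective could be estimated by Monte Carlo sampling. It is not the authors' implementation. It assumes that the ICS is the inner product of two functions divided by the product of their L2 norms, restricts itself to one dimension, and uses hypothetical names (`ugmf`, `ugmf_sq_integral`, `ics_score`). Under that assumption, the cross term $\int f(x)p(x)\,dx$ is the expectation of the UGMF $f$ under the data density $p$, so it can be approximated by a sample mean; $\int f(x)^2\,dx$ has a closed form for Gaussian components; and $\int p(x)^2\,dx$ does not depend on $f$, so it is dropped.

```python
import numpy as np

def ugmf(x, w, mu, sigma):
    """Unnormalized 1-D Gaussian mixture: sum_k w_k * exp(-(x - mu_k)^2 / (2 sigma_k^2))."""
    x = np.asarray(x, dtype=float)[:, None]           # shape (N, 1)
    return (w * np.exp(-0.5 * ((x - mu) / sigma) ** 2)).sum(axis=1)

def ugmf_sq_integral(w, mu, sigma):
    """Closed-form integral of ugmf(x)^2 over the real line."""
    var_sum = sigma[:, None] ** 2 + sigma[None, :] ** 2
    diff_sq = (mu[:, None] - mu[None, :]) ** 2
    # Integral of the product of two unnormalized Gaussian bumps i and j.
    pair = (np.sqrt(2.0 * np.pi) * sigma[:, None] * sigma[None, :]
            / np.sqrt(var_sum) * np.exp(-0.5 * diff_sq / var_sum))
    return float(w @ pair @ w)

def ics_score(samples, w, mu, sigma):
    """Monte Carlo estimate of the assumed ICS, up to the constant factor
    1 / sqrt(integral of p^2), which does not depend on the UGMF."""
    w, mu, sigma = (np.asarray(a, dtype=float) for a in (w, mu, sigma))
    cross = ugmf(samples, w, mu, sigma).mean()        # estimates E_p[f(x)]
    return cross / np.sqrt(ugmf_sq_integral(w, mu, sigma))

# Hypothetical usage: data drawn from a standard normal distribution.
rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, size=5000)
well_placed = ics_score(samples, [1.0], [0.0], [1.0])  # bump centered on the data
misplaced = ics_score(samples, [1.0], [3.0], [1.0])    # bump shifted away from the data
```

In this sketch the score is unchanged when all weights are multiplied by the same positive constant, which is consistent with the abstract's remark that the fitted UGMF matches the data density only up to a constant scalar and must be normalized afterwards. By the Cauchy-Schwarz inequality, the exact (non-sampled) quantity is largest when the UGMF is proportional to the data density.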

Keywords:

Machine Learning: Unsupervised Learning

Data Mining: Anomaly/Outlier Detection