Handling Noise in Boolean Matrix Factorization

Radim Belohlavek, Martin Trnecka

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 1433-1439. https://doi.org/10.24963/ijcai.2017/198

We critically examine existing considerations of noise in Boolean matrix factorization (BMF), including the algorithms' ability to deal with noise, and point out their weaknesses. We argue that the current understanding is underdeveloped and that current approaches miss an important aspect. We provide a new, quantitative way to assess the ability of an algorithm to handle noise. Our approach is based on a common-sense definition of robustness, which requires that the computed factorizations not be affected much by varying the noise in the data. We present an experimental evaluation of several existing algorithms and compare the results to the observations available in the literature. In addition to substantiating some properties that the literature claims without proper justification, our experiments reveal properties that were not previously reported, as well as properties that contradict certain claims made in the literature. Importantly, our approach reveals a line separating robust-to-noise from sensitive-to-noise algorithms, which previous approaches have not revealed.
Keywords:
Machine Learning: Data Mining
Machine Learning: Feature Selection/Construction
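
The robustness idea stated in the abstract (a factorization should change little as the noise in the data varies) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the entry-flipping noise model, the Jaccard comparison of reconstructed matrices, the helper names, and the toy `distinct_rows_factorizer`, which merely stands in for a real BMF algorithm.

```python
import numpy as np

def boolean_product(A, B):
    """Boolean matrix product: (A o B)[i, j] = OR over l of (A[i, l] AND B[l, j])."""
    return (A.astype(int) @ B.astype(int)) > 0

def flip_noise(I, p, rng):
    """Flip each entry independently with probability p (assumed noise model)."""
    mask = rng.random(I.shape) < p
    return np.logical_xor(I, mask)

def jaccard(X, Y):
    """Jaccard similarity of the sets of 1-entries of two Boolean matrices."""
    union = np.logical_or(X, Y).sum()
    return 1.0 if union == 0 else float(np.logical_and(X, Y).sum() / union)

def robustness_curve(I, factorize, noise_levels, seed=0):
    """Compare the factorization of clean data with factorizations of noisy data.

    `factorize` is any function returning Boolean matrices (A, B); values close
    to 1 at every noise level suggest the result is not affected much by noise.
    """
    rng = np.random.default_rng(seed)
    A0, B0 = factorize(I)
    clean = boolean_product(A0, B0)
    curve = []
    for p in noise_levels:
        A, B = factorize(flip_noise(I, p, rng))
        curve.append(jaccard(clean, boolean_product(A, B)))
    return curve

def distinct_rows_factorizer(I):
    """Toy placeholder, not a real BMF algorithm: one factor per distinct row."""
    B, inverse = np.unique(I.astype(np.uint8), axis=0, return_inverse=True)
    inverse = np.ravel(inverse)  # the shape of `inverse` differs across NumPy versions
    A = np.zeros((I.shape[0], B.shape[0]), dtype=bool)
    A[np.arange(I.shape[0]), inverse] = True
    return A, B.astype(bool)
```

Calling `robustness_curve(I, distinct_rows_factorizer, [0.0, 0.05, 0.1])` on a Boolean matrix `I` returns one similarity value per noise level. Because the toy factorizer reproduces its input exactly, it also reproduces every flipped entry, so it behaves like a noise-sensitive method; substituting a real BMF algorithm for `factorize` would be the natural way to use the sketch.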