Robust Feature Selection on Incomplete Data

Wei Zheng, Xiaofeng Zhu, Yonghua Zhu, Shichao Zhang

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 3191-3197. https://doi.org/10.24963/ijcai.2018/443

Feature selection is an indispensable preprocessing procedure for high-dimensional data analysis, but previous feature selection methods usually ignore sample diversity (i.e., every sample makes an individual contribution to model construction) and have limited ability to deal with incomplete data sets, where some of the training samples contain unobserved values. To address these issues, we first propose a robust feature selection framework to reduce the influence of outliers, and then introduce an indicator matrix that keeps unobserved values out of the numerical computation of feature selection, so that both our proposed framework and existing feature selection frameworks can conduct feature selection on incomplete data sets. We further propose a new optimization algorithm to optimize the resulting objective function and prove that the algorithm converges fast. Experimental results on both real and artificial incomplete data sets demonstrate that our proposed method outperforms the compared feature selection methods in terms of clustering performance.
Keywords:
Machine Learning: Unsupervised Learning
Machine Learning: Feature Selection; Learning Sparse Models
Machine Learning: Dimensionality Reduction and Manifold Learning
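
The indicator-matrix idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's objective function or algorithm: the names `indicator_matrix` and `masked_squared_error`, the use of NaN to mark unobserved entries, and the plain squared-error loss are all assumptions made for illustration. The sketch only shows how an element-wise 0/1 mask keeps unobserved entries out of a numerical loss.

```python
import numpy as np

def indicator_matrix(X):
    """Return M with 1.0 where X is observed and 0.0 where it is NaN (assumed encoding)."""
    return (~np.isnan(X)).astype(float)

def masked_squared_error(X, X_hat, M):
    """Sum of squared residuals over observed entries only (illustrative loss)."""
    # Fill unobserved entries with 0 first, because NaN * 0 is still NaN.
    X_filled = np.where(M > 0, X, 0.0)
    # Element-wise product with M zeroes out residuals at unobserved positions.
    R = M * (X_filled - X_hat)
    return float(np.sum(R ** 2))

# Toy data: two samples, three features, two unobserved entries.
X = np.array([[1.0, 2.0, np.nan],
              [4.0, np.nan, 6.0]])
# Hypothetical reconstruction; values at unobserved positions are arbitrary.
X_hat = np.array([[1.0, 1.0, 100.0],
                  [2.0, 100.0, 6.0]])

M = indicator_matrix(X)
loss = masked_squared_error(X, X_hat, M)
print(M)
print(loss)
```

In this toy example the large values of `X_hat` at the unobserved positions do not affect the loss, which is the behavior the indicator matrix is meant to provide.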