Measuring Statistical Dependence via the Mutual Information Dimension / 1692
Mahito Sugiyama, Karsten M. Borgwardt
We propose to measure statistical dependence between two random variables by the mutual information dimension (MID), and present a scalable parameter-free estimation method for this task. Supported by sound dimension theory, our method gives an effective solution to the problem of detecting interesting relationships of variables in massive data, which is nowadays a heavily studied topic in many scientific disciplines. Different from classical Pearson's correlation coefficient, MID is zero if and only if two random variables are statistically independent and is translation and scaling invariant. We experimentally show superior performance of MID in detecting various types of relationships in the presence of noise data. Moreover, we illustrate that MID can be effectively used for feature selection in regression.