Protein Function Prediction by Integrating Multiple Kernels / 1869
Guoxian Yu, Huzefa Rangwala, Carlotta Domeniconi, Guoji Zhang, Zili Zhang

Determining protein function constitutes an exercise in integrating information derived from several heterogeneous high-throughput experiments. To utilize the information spread across multiple sources in a combined fashion, these data sources are transformed into kernels. Several protein function prediction methods follow a two-phased approach: they first optimize the weights on individual kernels to produce a composite kernel, and then train a classifier on the composite kernel. As such, these methods result in an optimal composite kernel, but not necessarily in an optimal classifier. On the other hand, some methods optimize the loss of binary classifiers, and learn weights for the different kernels iteratively. A protein has multiple functions, and each function can be viewed as a label. These methods solve the problem of optimizing weights on the input kernels for each of the labels. This is computationally expensive and ignores inter-label correlations. In this paper, we propose a method called Protein Function Prediction by Integrating Multiple Kernels(ProMK). ProMK iteratively optimizes the phases of learning optimal weights and reducing the empirical loss of a multi-label classifier for each of the labels simultaneously, using a combined objective function. ProMK can assign larger weights to smooth kernels and downgrade the weights on noisy kernels. We evaluate the ability of ProMK to predict the function of proteins using several standard benchmarks. We show that our approach performs better than previously proposed protein function prediction approaches that integrate data from multiple networks, and multi-label multiple kernel learning methods.