Learning Co-Substructures by Kernel Dependence Maximization

Sho Yokoi, Daichi Mochihashi, Ryo Takahashi, Naoaki Okazaki, Kentaro Inui

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 3329-3335. https://doi.org/10.24963/ijcai.2017/465

Modeling associations between items in a dataset is a problem that is frequently encountered in data and knowledge mining research. Most previous studies have simply applied a predefined fixed pattern for extracting the substructure of each item pair and then analyzed the associations between these substructures. Such fixed patterns may, however, fail to capture the significant associations. We therefore propose the novel machine learning task of extracting a strongly associated substructure pair (co-substructure) from each input item pair. We call this task dependent co-substructure extraction (DCSE) and formalize it as a dependence maximization problem. We then discuss two critical issues with this task: data sparsity and a huge search space. To address data sparsity, we adopt the Hilbert-Schmidt independence criterion as the objective function. To improve search efficiency, we adopt the Metropolis-Hastings algorithm. We report the results of empirical evaluations in which the proposed method is applied to acquiring and predicting narrative event pairs, an active task in the field of natural language processing.
Keywords:
Machine Learning: Kernel Methods
Machine Learning: Unsupervised Learning
Natural Language Processing: Natural Language Processing
Natural Language Processing: Information Extraction
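
The abstract names the Hilbert-Schmidt independence criterion (HSIC) as the objective function. As a rough illustration of what such a kernel dependence score computes, the following is a minimal sketch of the standard biased empirical HSIC estimator, tr(KHLH)/n^2, and not the authors' implementation. The Gaussian kernel on real-valued toy data, the bandwidth `sigma`, and the function names `gaussian_gram` and `hsic` are assumptions made for this example; the paper works with kernels over substructures of item pairs, which are not reproduced here.

```python
import numpy as np


def gaussian_gram(x, sigma=1.0):
    """Gram matrix of a Gaussian kernel (assumed here for illustration only)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum(x ** 2, axis=1)
    # Squared Euclidean distances, clipped at zero against rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def hsic(K, L):
    """Biased empirical HSIC: trace(K H L H) / n^2, with H the centering matrix."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n ** 2


# Toy data (hypothetical): y_dep depends on x, y_ind is drawn independently.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y_dep = x + 0.1 * rng.normal(size=200)
y_ind = rng.normal(size=200)

hsic_dep = hsic(gaussian_gram(x), gaussian_gram(y_dep))
hsic_ind = hsic(gaussian_gram(x), gaussian_gram(y_ind))
print(f"HSIC (dependent pair):   {hsic_dep:.4f}")
print(f"HSIC (independent pair): {hsic_ind:.4f}")
```

On this toy data the dependent pair should score noticeably higher than the independent pair. A dependence maximization procedure would search for the substructure pairs that make such a score large; the Metropolis-Hastings search over substructures described in the abstract is not shown here.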