Mitigating the Effect of Out-of-Vocabulary Entity Pairs in Matrix Factorization for KB Inference

Prachi Jain, Shikhar Murty, Mausam, Soumen Chakrabarti

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 4122-4129. https://doi.org/10.24963/ijcai.2018/573

This paper analyzes the varied performance of Matrix Factorization (MF) on the related tasks of relation extraction and knowledge-base completion, which were recently unified into a single framework of knowledge-base inference (KBI) [Toutanova et al., 2015]. We first propose a new evaluation protocol that makes comparisons between MF and Tensor Factorization (TF) models fair. Under this protocol, MF performance drops steeply. Our analysis attributes the drop to the high out-of-vocabulary (OOV) rate of entity pairs in the test folds of commonly used datasets. To alleviate this issue, we propose three extensions to MF. Our best model is a TF-augmented MF model. This hybrid model is robust and obtains strong results across various KBI datasets.
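To make the OOV notion concrete, the following is a minimal sketch of how an entity-pair OOV rate could be measured on a test fold. It assumes facts are stored as (subject, relation, object) triples and counts a test pair as OOV when that (subject, object) pair never occurs in training. The function name `oov_pair_rate` and the toy triples are illustrative assumptions, not taken from the paper.

```python
def oov_pair_rate(train_triples, test_triples):
    """Fraction of test triples whose (subject, object) pair is unseen in training.

    Assumption: a pair counts as OOV if it appears in no training triple,
    regardless of relation.
    """
    seen_pairs = {(s, o) for s, _, o in train_triples}
    if not test_triples:
        return 0.0
    oov = sum(1 for s, _, o in test_triples if (s, o) not in seen_pairs)
    return oov / len(test_triples)


# Toy data (hypothetical), only to exercise the function.
train = [
    ("paris", "capital_of", "france"),
    ("paris", "located_in", "france"),
    ("berlin", "capital_of", "germany"),
]
test = [
    ("paris", "city_in", "france"),       # pair seen in training
    ("rome", "capital_of", "italy"),      # pair unseen: OOV
    ("berlin", "located_in", "germany"),  # pair seen in training
    ("madrid", "capital_of", "spain"),    # pair unseen: OOV
]

rate = oov_pair_rate(train, test)
print(f"entity-pair OOV rate: {rate:.2f}")
```

A model that learns one embedding per entity pair has no trained representation for the OOV pairs counted here, whereas a model that embeds entities individually can still score a new pair as long as both entities were seen separately.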
Keywords:
Natural Language Processing: Information Extraction
Knowledge Representation and Reasoning: Non-classical Logics for Knowledge Representation
Natural Language Processing: Knowledge Extraction
Natural Language Processing: Embeddings