Abstract
Entity Linking with Effective Acronym Expansion, Instance Selection, and Topic Modeling
Wei Zhang, Yan Chuan Sim, Jian Su, Chew Lim Tan
Entity linking maps name mentions in the documents to entries in a knowledge base through resolving the name variations and ambiguities. In this paper, we propose three advancements for entity linking. Firstly, expanding acronyms can effectively reduce the ambiguity of the acronym mentions. However, only rule-based approaches relying heavily on the presence of text markers have been used for entity linking. In this paper, we propose a supervised learning algorithm to expand more complicated acronyms encountered, which leads to 15.1% accuracy improvement over state-of-the-art acronym expansion methods. Secondly, as entity linking annotation is expensive and labor intensive, to automate the annotation process without compromise of accuracy, we propose an instance selection strategy to effectively utilize the automatically generated annotation. In our selection strategy, an informative and diverse set of instances are selected for effective disambiguation. Lastly, topic modeling is used to model the semantic topics of the articles. These advancements give statistical significant improvement to entity linking individually. Collectively they lead the highest performance on KBP-2010 task.