Abstract

Word Sense Disambiguation through Sememe Labeling

Xiangyu Duan, Jun Zhao, Bo Xu

Currently most word sense disambiguation (WSD) systems are relatively individual word sense experts. Scarcely do these systems take word sense transitions between senses of linearly consecutive words or syntactically dependent words into consideration. Word sense transitions are very important. They embody the fluency of semantic expression and avoid sparse data problem effectively. In this paper, HowNet knowledge base is used to decompose every word sense into several sememes. Then one transition between two words' senses becomes multiple transitions between sememes. Sememe transitions are much easier to be captured than word sense transitions due to much less sememes. When sememes are labeled, WSD is done. In this paper, multi-layered conditional random fields (MLCRF) is proposed to model sememe transitions. The experiments show that MLCRF performs better than a base-line system and a maximum entropy model. Syntactic and hypernym features can enhance the performance significantly.