Joint Learning Embeddings for Chinese Words and their Components via Ladder Structured Networks

Yan Song, Shuming Shi, Jing Li

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 4375-4381. https://doi.org/10.24963/ijcai.2018/608

The components of a Chinese word, such as its characters and radicals, are important sources of semantic information about the word. In this paper, we propose a novel framework, ladder structured networks (LSN), which contains three layers representing words, characters, and radicals and learns their embeddings synchronously. LSN captures not only the relations among words, but also the relations among their component characters and radicals, as well as the relations across layers. Each layer in LSN is pluggable, so any particular type of unit (word, character, or radical) can be removed and the LSN adjusted for particular types of input. To evaluate our framework, we use word similarity as the intrinsic evaluation, and part-of-speech tagging and document classification as extrinsic evaluations. Experimental results confirm the validity of our approach and show that it outperforms previous work.
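
The three-layer, pluggable organization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `LadderSketch`, the toy `RADICALS` lookup, the random initialization, and the plain averaging used to combine vectors are all assumptions made for illustration, and no training objective is included. The sketch only shows how a word decomposes into units on each layer and how a layer can be removed.

```python
import random

# Hypothetical toy lookup from character to radical; a real system
# would need a full character-to-radical dictionary.
RADICALS = {"河": "氵", "流": "氵", "树": "木", "林": "木"}


class LadderSketch:
    """Illustrative three-layer container (word, char, radical).

    Assumption: each layer is an independent embedding table and layers
    are combined by averaging. The paper's actual model may differ.
    """

    def __init__(self, dim=4, layers=("word", "char", "radical"), seed=0):
        self.dim = dim
        self.layers = tuple(layers)  # pluggable: omit a name to drop that layer
        self.rng = random.Random(seed)
        self.tables = {name: {} for name in self.layers}

    def _vector(self, layer, unit):
        # Lazily create a small random vector the first time a unit is seen.
        table = self.tables[layer]
        if unit not in table:
            table[unit] = [self.rng.uniform(-0.5, 0.5) for _ in range(self.dim)]
        return table[unit]

    def units(self, word):
        # Decompose a word into its units on every enabled layer.
        chars = list(word)
        all_units = {
            "word": [word],
            "char": chars,
            "radical": [RADICALS[c] for c in chars if c in RADICALS],
        }
        return {layer: all_units[layer] for layer in self.layers}

    def represent(self, word):
        # Average the vectors of all units on the enabled layers.
        # Returns an empty list if no enabled layer yields a unit.
        vectors = [
            self._vector(layer, unit)
            for layer, layer_units in self.units(word).items()
            for unit in layer_units
        ]
        return [sum(column) / len(vectors) for column in zip(*vectors)]
```

For example, `LadderSketch().units("河流")` lists the word itself, its two characters, and the radical they share, while `LadderSketch(layers=("word", "char"))` drops the radical layer entirely, mirroring the idea that a layer can be unplugged for inputs where that unit type is unavailable.
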
Keywords:
Natural Language Processing: Natural Language Processing
Natural Language Processing: Embeddings