Guided Attention Network for Concept Extraction

Guided Attention Network for Concept Extraction

Songtao Fang, Zhenya Huang, Ming He, Shiwei Tong, Xiaoqing Huang, Ye Liu, Jie Huang, Qi Liu

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 1449-1455. https://doi.org/10.24963/ijcai.2021/200

Concept extraction aims to find words or phrases describing a concept from massive texts. Recently, researchers propose many neural network-based methods to automatically extract concepts. Although these methods for this task show promising results, they ignore structured information in the raw textual data (e.g., title, topic, and clue words). In this paper, we propose a novel model, named Guided Attention Concept Extraction Network (GACEN), which uses title, topic, and clue words as additional supervision to provide guidance directly. Specifically, GACEN comprises two attention networks, one of them is to gather the relevant title and topic information for each context word in the document. The other one aims to model the implicit connection between informative words (clue words) and concepts. Finally, we aggregate information from two networks as input to Conditional Random Field (CRF) to model dependencies in the output. We collected clue words for three well-studied datasets. Extensive experiments demonstrate that our model outperforms the baseline models with a large margin, especially when the labeled data is insufficient.
Keywords:
Data Mining: Information Retrieval
Data Mining: Mining Text, Web, Social Media