Story Ending Prediction by Transferable BERT

Zhongyang Li, Xiao Ding, Ting Liu

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 1800-1806. https://doi.org/10.24963/ijcai.2019/249

Recent advances, such as GPT and BERT, have shown success in pre-training a transformer language model and fine-tuning it to improve downstream NLP systems. However, this framework still struggles to effectively incorporate supervised knowledge from other related tasks. In this study, we investigate a transferable BERT (TransBERT) training framework, which can transfer to a target task not only general language knowledge from large-scale unlabeled data but also specific kinds of knowledge from various semantically related supervised tasks. Specifically, we propose three kinds of transfer tasks, namely natural language inference, sentiment classification, and next action prediction, to further train BERT on top of the pre-trained model. This gives the model a better initialization for the target task. We take story ending prediction as the target task in our experiments. The final result, an accuracy of 91.8%, dramatically outperforms previous state-of-the-art baseline methods. Several comparative experiments offer practical guidance on how to select transfer tasks to improve BERT.
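The three-stage flow the abstract describes (general pre-training, then training on a related supervised transfer task, then fine-tuning on the target task) can be sketched as follows. This is a minimal, self-contained illustration of the staging order only; the toy Model class, method names, and task labels are assumptions for exposition, not the authors' code.

```python
class Model:
    """Toy stand-in for a transformer whose weights are shaped by each stage."""

    def __init__(self, weights):
        self.weights = dict(weights)

    def train_on(self, task):
        # Stand-in for gradient updates: record which task shaped the weights.
        self.weights[task] = "trained"
        return self


def transbert(transfer_task, target_task):
    # Stage 1: start from general language-model pre-training (e.g. BERT).
    model = Model({"language_modeling": "pretrained"})
    # Stage 2: further train on a semantically related supervised task,
    # e.g. natural language inference, sentiment classification,
    # or next action prediction.
    model.train_on(transfer_task)
    # Stage 3: fine-tune on the target task (story ending prediction),
    # starting from the transfer-task-informed initialization.
    model.train_on(target_task)
    return model


model = transbert("natural_language_inference", "story_ending_prediction")
print(sorted(model.weights))
```

In practice each `train_on` call would be a full supervised training run; the point of the sketch is that stage 2 gives stage 3 a better starting point than pre-training alone.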
Keywords:
Knowledge Representation and Reasoning: Common-Sense Reasoning
Natural Language Processing: Natural Language Processing
Machine Learning: Deep Learning