End-to-End Transition-Based Online Dialogue Disentanglement

End-to-End Transition-Based Online Dialogue Disentanglement

Hui Liu, Zhan Shi, Jia-Chen Gu, Quan Liu, Si Wei, Xiaodan Zhu

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 3868-3874. https://doi.org/10.24963/ijcai.2020/535

Dialogue disentanglement aims to separate intermingled messages into detached sessions. The existing research focuses on two-step architectures, in which a model first retrieves the relationships between two messages and then divides the message stream into separate clusters. Almost all existing work puts significant efforts on selecting features for message-pair classification and clustering, while ignoring the semantic coherence within each session. In this paper, we introduce the first end-to- end transition-based model for online dialogue disentanglement. Our model captures the sequential information of each session as the online algorithm proceeds on processing a dialogue. The coherence in a session is hence modeled when messages are sequentially added into their best-matching sessions. Meanwhile, the research field still lacks data for studying end-to-end dialogue disentanglement, so we construct a large-scale dataset by extracting coherent dialogues from online movie scripts. We evaluate our model on both the dataset we developed and the publicly available Ubuntu IRC dataset [Kummerfeld et al., 2019]. The results show that our model significantly outperforms the existing algorithms. Further experiments demonstrate that our model better captures the sequential semantics and obtains more coherent disentangled sessions.
Keywords:
Natural Language Processing: Dialogue
Natural Language Processing: Information Extraction
Natural Language Processing: Information Retrieval