Recent Advances in End-to-End Simultaneous Speech Translation
Xiaoqian Liu, Guoqiang Hu, Yangfan Du, Erfeng He, YingFeng Luo, Chen Xu, Tong Xiao, Jingbo Zhu
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Survey Track. Pages 8142-8150.
https://doi.org/10.24963/ijcai.2024/900
Simultaneous speech translation (SimulST) is a demanding task that involves generating translations in real time while continuously processing speech input. This paper offers a comprehensive overview of recent developments in SimulST research, focusing on four major challenges. First, processing lengthy and continuous speech streams poses significant hurdles. Second, satisfying real-time requirements is inherently difficult because translation output must be produced immediately. Third, striking a balance between translation quality and latency constraints remains a critical challenge. Finally, the scarcity of annotated data adds another layer of complexity to the task. Through our exploration of these challenges and the proposed solutions, we aim to provide valuable insights into the current landscape of SimulST research and suggest promising directions for future exploration.
Keywords:
Natural Language Processing: General
Natural Language Processing: NLP: Speech