JEPOO: Highly Accurate Joint Estimation of Pitch, Onset and Offset for Music Information Retrieval
Haojie Wei, Jun Yuan, Rui Zhang, Yueguo Chen, Gang Wang
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 4892-4902.
https://doi.org/10.24963/ijcai.2023/544
Melody extraction is a core task in music information retrieval, and pitch, onset and offset estimation are key sub-tasks of melody extraction. Existing methods have limited accuracy and work for only one type of data, either single-pitch or multi-pitch. In this paper, we propose JEPOO, a highly accurate method for joint estimation of pitch, onset and offset. We address the challenges of joint learning optimization and of handling both single-pitch and multi-pitch data through a novel model design and a new optimization technique named Pareto modulated loss with loss weight regularization. This is the first method that can accurately handle both single-pitch and multi-pitch music data, and even a mix of them. A comprehensive experimental study on a wide range of real datasets shows that JEPOO outperforms state-of-the-art methods by up to 10.6%, 8.3% and 10.3% for the prediction of Pitch, Onset and Offset, respectively, and that JEPOO is robust across various types of data and instruments. The ablation study validates the effectiveness of each component of JEPOO.
Keywords:
Multidisciplinary Topics and Applications: MDA: Arts and creativity
Multidisciplinary Topics and Applications: MDA: Entertainment
Multidisciplinary Topics and Applications: MDA: Other