WCSE 2019 SUMMER ISBN: 978-981-14-1684-2
DOI: 10.18178/wcse.2019.06.068

Onset-Aware Polyphonic Piano Transcription: A CNN-Based Approach

Sicong Kong, Wei Xu, Wei Liu, Xuan Gong, Juanting Liu, Wenqing Cheng

Abstract— Automatic music transcription (AMT) transforms musical audio content into symbolic notation, including onsets, offsets and pitches. In this paper, we design a polyphonic piano transcription system based on convolutional neural networks (CNNs) that improves note-level results. The proposed method has two parts. First, one CNN model detects onset events and aligns the onsets of the notes to more accurate positions. Second, another CNN model detects the onsets of the 88 notes. We further improve model performance by using a dual-channel spectrogram as input, an appropriate number of convolution layers, and weights for the positive samples in the loss function. The public MAPS dataset is used for training and evaluation. On the 'ENSTDkCl' subset, the proposed solution achieves a note-level F1-measure of 85.15%. To the best of our knowledge, this is the highest note-level F1-measure among state-of-the-art methods.

Index Terms— polyphonic piano transcription, convolutional neural network, onset detection, onset alignment
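
The abstract mentions weighting the positive samples in the loss function, which is a common remedy when onset frames are far rarer than non-onset frames. The following is a minimal sketch of one way such a weighting could look, assuming a binary cross-entropy loss; the function name `weighted_bce`, the `pos_weight` parameter, and the example values are illustrative assumptions and are not taken from the paper.

```python
import math


def weighted_bce(targets, probs, pos_weight=2.0, eps=1e-7):
    """Mean binary cross-entropy with extra weight on positive samples.

    Hypothetical helper for illustration only; the paper's actual loss
    and weight value are not given in the abstract.
    """
    total = 0.0
    for t, p in zip(targets, probs):
        # Clamp predictions so that log() never receives 0.
        p = min(max(p, eps), 1.0 - eps)
        # Positive (onset) terms are scaled by pos_weight; negatives are not.
        total += -(pos_weight * t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(targets)


# Toy frame labels: a single onset among mostly silent frames (assumed data).
targets = [0, 0, 1, 0]
probs = [0.1, 0.2, 0.3, 0.1]

plain = weighted_bce(targets, probs, pos_weight=1.0)
weighted = weighted_bce(targets, probs, pos_weight=5.0)
print(plain < weighted)  # the missed onset costs more once it is weighted
```

With `pos_weight=1.0` this reduces to ordinary binary cross-entropy; larger values penalize missed onsets more heavily than false alarms, which tends to trade some precision for recall.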

School of Electronic Information and Communications, Huazhong University of Science and Technology, CHINA

Cite: Sicong Kong, Wei Xu, Wei Liu, Xuan Gong, Juanting Liu, Wenqing Cheng, "Onset-Aware Polyphonic Piano Transcription: A CNN-Based Approach," Proceedings of 2019 the 9th International Workshop on Computer Science and Engineering, pp. 454-461, Hong Kong, 15-17 June, 2019.