Abstract : The recognition of mid-18th to mid-20th century piano scores presents segmentation challenges caused by touching and broken symbols, which result from imprinting techniques and degradation over time. We present a new notehead accidental dataset containing 2955 images from dense and damaged piano scores. We address this detection problem with very small training samples using a simple Spatial Transformer (ST)-based Convolutional Neural Network detector improved through bootstrapping and contextual information, and more powerful deep learning detectors (Faster R-CNN, R-FCN, and SSD) with transfer learning from the COCO dataset. We train all our detectors using 5-fold cross-validation and obtain 98.73% mean Average Precision (mAP) at an Intersection over Union (IoU) threshold of 0.75 with our best detector. Our ST-based detector obtains a lower mAP of 94.81%, but runs 40 times faster and uses 18 times less memory.
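As an illustration of the IoU criterion mentioned in the abstract, the following is a minimal sketch of how a predicted bounding box might be compared against a ground-truth box at the 0.75 threshold. It is not the authors' evaluation code: the function names `iou` and `is_true_positive`, the `(x_min, y_min, x_max, y_max)` box layout, and the use of `>=` at the threshold are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are assumed to be (x_min, y_min, x_max, y_max) tuples;
    this layout is an assumption for the example.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b

    # Width and height of the overlap, clamped to zero for disjoint boxes.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h

    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.75):
    """Hypothetical helper: a detection matches if its IoU reaches the threshold.

    Whether the comparison is >= or > varies between evaluation
    protocols; >= is assumed here.
    """
    return iou(pred_box, gt_box) >= threshold
```

For example, a 10 x 10 prediction shifted by one pixel relative to a 10 x 10 ground-truth box overlaps on 90 of 110 pixels in the union, so it still counts as a match at 0.75, whereas a shift of five pixels does not.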
https://hal.archives-ouvertes.fr/hal-02430041
Contributor : Kwon-Young Choi
Submitted on : Tuesday, January 7, 2020 - 9:44:26 AM
Last modification on : Thursday, January 7, 2021 - 4:35:30 PM
Kwon-Young Choi, Bertrand Couasnon, Yann Ricquebourg, Richard Zanibbi. CNN-Based Accidental Detection in Dense Printed Piano Scores. 15th International Conference on Document Analysis and Recognition, Sep 2019, Sydney, Australia. ⟨hal-02430041⟩