Hold out method for CVC-MUSICIMA

Published: 14-08-2017 | Version 1 | DOI: 10.17632/jtyrxby8gd.1
Bionifo Labic


This dataset is derived from CVC-MUSCIMA (the original dataset is available at http://www.cvc.uab.es/cvcmuscima/index_database.html), a dataset of handwritten music score images designed for staff removal and writer identification. For the classification task there are 50 classes, one per writer. Each writer transcribed the same 20 music scores, so the dataset contains 1,000 images of roughly 3,500 x 1,600 pixels, with some variation. The images are offered both with and without staff lines. With our patch extraction strategy, each music score yields roughly 250 patches on average. The data is split as follows: the first 17 music scores of each writer are used for training and the remaining 3 for testing. Hence, 850 music score images (227,560 patches) were used to train and 150 music score images (45,063 patches) were used to test our approach.
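
The hold-out split described above can be sketched in a few lines of Python. This is only an illustration, not the code used to build the dataset: it assumes each image is identified by a hypothetical (writer_id, score_id) pair with 1-based indices, and the function name hold_out_split is ours. Patch extraction is not shown.

```python
from typing import Iterable, List, Tuple

# Figures taken from the dataset description.
N_WRITERS = 50        # classes (writers)
N_SCORES = 20         # music scores transcribed by each writer
N_TRAIN_SCORES = 17   # first 17 scores per writer go to training

# Hypothetical image identifier: (writer_id, score_id), both 1-based.
Sample = Tuple[int, int]


def hold_out_split(
    samples: Iterable[Sample], n_train_scores: int = N_TRAIN_SCORES
) -> Tuple[List[Sample], List[Sample]]:
    """Split images by score index: the first n_train_scores scores of
    every writer are used for training, the remaining ones for testing."""
    train: List[Sample] = []
    test: List[Sample] = []
    for writer_id, score_id in samples:
        if score_id <= n_train_scores:
            train.append((writer_id, score_id))
        else:
            test.append((writer_id, score_id))
    return train, test


# Enumerate every image in the dataset and split it.
all_images = [
    (w, s)
    for w in range(1, N_WRITERS + 1)
    for s in range(1, N_SCORES + 1)
]
train_images, test_images = hold_out_split(all_images)
print(len(train_images), len(test_images))  # 850 150
```

Because the split is made on the score index and not on the writer, every one of the 50 writers appears in both the training and the test set, which is what a writer identification task needs.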