UBD Herbarium - Dataset of Intact Leaves for Reconstruction
Description
This dataset consists of a total of 2933 augmented images of individual intact herbarium leaves on a uniform background, collected from the Universiti Brunei Darussalam Herbarium (UBDH). The leaves were extracted automatically by a deep learning pipeline consisting of a semantic segmentation model, connected component analysis, and a single-leaf classifier trained on binary images to detect intact leaves.

All leaves come from ten plant families: Anacardiaceae, Annonaceae, Dipterocarpaceae, Ebenaceae, Euphorbiaceae, Malvaceae, Phyllanthaceae, Polygalaceae, Rubiaceae and Sapotaceae. The number of images per family in each split is as follows.

Family Name        Training  Validation  Testing
Anacardiaceae           209          29       61
Annonaceae              199          28       58
Dipterocarpaceae        202          28       59
Ebenaceae               201          30       60
Euphorbiaceae           207          29       61
Malvaceae               207          29       61
Phyllanthaceae          210          30       60
Polygalaceae            200          28       58
Rubiaceae               205          29       60
Sapotaceae              200          28       58
Total                  2040         288      596

Reference
Reconstruction of damaged herbarium leaves using deep learning techniques for improving classification accuracy. Ecological Informatics, available online 2 February 2021, 101243. https://doi.org/10.1016/j.ecoinf.2021.101243
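
The connected component analysis step mentioned in the description can be illustrated with a minimal sketch. It is not the pipeline's actual code: the function names `label_components` and `bounding_box`, the use of 4-connectivity, the `min_area` filter and the toy mask are all assumptions made for illustration. The idea is that a binary mask from the segmentation model (1 = leaf pixel) is split into separate blobs, each a candidate leaf that a single-leaf classifier could then accept or reject.

```python
from collections import deque


def label_components(mask, min_area=1):
    """Split a binary mask into connected blobs of 1-pixels.

    mask is a list of rows holding 0/1 values. Returns a list of blobs,
    each a list of (row, col) pixels. 4-connectivity and the min_area
    filter are assumptions of this sketch, not documented pipeline settings.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited leaf pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Drop tiny blobs, which are more likely noise than leaves.
            if len(pixels) >= min_area:
                components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a blob, inclusive."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)


# Toy mask: two leaf-like blobs and one isolated noise pixel.
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0],
]
leaves = label_components(mask, min_area=2)
boxes = [bounding_box(p) for p in leaves]
print(boxes)
```

Each bounding box could then be used to crop a candidate leaf from the original scan. On real herbarium sheets a library routine for connected component labelling would normally replace the hand-written flood fill.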