Comprehensive Dataset of Mandarin Leaf Varieties.

Published: 8 July 2024 | Version 1 | DOI: 10.17632/8bvv2pr2d3.1
Mushfiqur Rahman


Dataset Description: This dataset contains images of healthy leaves from four mandarin varieties collected in Bangladesh: China Mishti, Mandarin, Darjeeling, and Nagpuri. Each variety has its own distinguishing features. The data were collected through field surveys and direct sampling from mandarin orchards. The original set comprises China Mishti (511 leaves), Mandarin (540 leaves), Darjeeling (500 leaves), and Nagpuri (366 leaves), for 1,917 leaves in total. The dataset is built for research purposes and is therefore divided into final augmentation, test, training, and validation parts. Images are provided in JPG format, with a separate folder for each variety, so the folder structure shows how each image is meant to be used.

Usages: The Bangladesh Mandarin Varieties dataset has applications across several fields, covering the China Mishti, Mendaring, Darjeeling, and Nagpuri mandarin types. In data science and machine learning, it can be used for predictive analytics, classification, clustering, and deep learning applications such as image recognition and yield prediction. Environmental scientists can study the impact of climate and sustainable practices on leaf health. The dataset also serves educational purposes, providing a practical resource for teaching and student projects in data analysis and agricultural science.

Data sources: Most of the data were collected in Natore, Bangladesh. A small portion was collected in Dhaka.

Data Size: The original images total 7.77 GB and are stored in four folders, one per variety (China Mishti, Mendaring, Darjeeling, and Nagpuri), with image dimensions of 2608 × 4624 pixels. After primary processing, augmentation produced eight thousand (8,000) images, two thousand (2,000) for each variety. The augmented data were then separated into three sub-folders, with 80% in the test folder, 10% in the train folder, and another 10% for validation, for a total storage size of 1.18 GB.
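The per-variety counts and the 80/10/10 split described above can be checked after download with a short script. The following is a minimal sketch, not part of the dataset itself: the helper names `count_images` and `fractions` are hypothetical, and the folder paths in the commented usage are placeholders, since the exact directory names in the archive are not stated here.

```python
from pathlib import Path

# File extensions treated as JPG images (case-insensitive match below).
JPG_SUFFIXES = {".jpg", ".jpeg"}


def count_images(root):
    """Return {subfolder name: number of JPG files found anywhere beneath it}.

    `root` is a folder whose immediate subfolders are either variety folders
    or split folders; nested folders are searched recursively.
    """
    root = Path(root)
    counts = {}
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[sub.name] = sum(
            1
            for f in sub.rglob("*")
            if f.is_file() and f.suffix.lower() in JPG_SUFFIXES
        )
    return counts


def fractions(counts):
    """Convert a {name: count} mapping into {name: share of the total}."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


# Example usage (placeholder paths; adjust to the extracted archive):
# per_variety = count_images("path/to/original_images")
# per_split = fractions(count_images("path/to/augmented_images"))
```

Run against the original images, the counts should match the stated 511, 540, 500, and 366 leaves. Run against the augmented folder, the shares should come out near 0.8, 0.1, and 0.1 for the three sub-folders.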



Daffodil International University


Computer Vision, Image Processing, Image Classification, Deep Learning