Published: 20 May 2024 | Version 1 | DOI: 10.17632/zncrtr2yhn.1
Som Saha, Tanmay Sarkar


The dataset consists of over 500 images of cucumbers (Cucumis sativus L.), categorized into two classes: "good" and "bad." These images were captured using a Xiaomi 11i mobile camera against a black background under daylight conditions.

**Data Description:**

1. **Classes:**
   - Good: Represents healthy cucumbers exhibiting desirable characteristics such as uniform shape, color, and absence of blemishes or deformities.
   - Bad: Encompasses cucumbers displaying signs of damage, disease, or other undesirable traits such as discoloration, rot, bruises, or irregular shapes.
2. **Image Collection:**
   - The dataset comprises over 500 images, with a substantial number depicting both good and bad instances of cucumbers.
   - Images were captured under consistent lighting conditions to maintain uniformity and minimize environmental variability.
   - A black background was used to enhance cucumber visibility and isolate the subject.
3. **Data Source:**
   - Images were captured using a Xiaomi 11i mobile camera, ensuring consistent image quality and resolution throughout the dataset.
   - Daylight conditions were chosen to provide natural lighting, reducing artificial effects on cucumber appearance.
4. **Annotation:**
   - Each image is labeled according to its class (good or bad), facilitating supervised learning tasks.
   - Annotations may include bounding boxes or masks delineating the cucumber area to aid in localization tasks.
5. **Data Preprocessing:**
   - Preprocessing steps such as resizing, normalization, and background removal may have been applied to the images to enhance model performance and reduce computational complexity.
   - Metadata such as image resolution, format, and capture settings may accompany the dataset for reference.
6. **Data Distribution:**
   - The dataset maintains a balanced distribution between good and bad cucumbers, ensuring equal representation of both classes.
   - Randomization techniques may have been employed during data collection and organization to mitigate biases in model training.
7. **Potential Applications:**
   - The dataset can be utilized for various machine learning tasks, including classification, object detection, and image segmentation, particularly in the agricultural domain.
   - Applications may include automated sorting systems for cucumber quality control, disease detection, and crop management.
8. **Limitations:**
   - Despite efforts to ensure data consistency and quality, variations in lighting conditions, camera angles, and cucumber orientation may introduce some level of variability.
   - The dataset primarily focuses on cucumbers of Cucumis sativus L. and may not generalize well to other cucumber varieties or environmental conditions.
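
For readers who want to use the two-class labels for supervised learning, the following is a minimal sketch of indexing the images and checking the class balance before training. It assumes a folder layout of one subdirectory per class (`good/` and `bad/`) holding JPEG or PNG files; the description does not state this layout, and the function names `index_images`, `stratified_split`, and `class_counts` are hypothetical helpers, not part of the dataset.

```python
import random
from collections import Counter
from pathlib import Path

# Assumption: images are stored as <root>/good/* and <root>/bad/*.
# The dataset description does not confirm this layout.
CLASSES = ("good", "bad")
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def index_images(root):
    """Return a list of (path, label) pairs, one per image file found."""
    samples = []
    for label in CLASSES:
        class_dir = Path(root) / label
        if not class_dir.is_dir():
            continue  # tolerate a missing class folder
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


def class_counts(samples):
    """Count how many samples carry each label."""
    return Counter(label for _, label in samples)


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split into train/test, holding out the same fraction of each class."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in CLASSES:
        group = [s for s in samples if s[1] == label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test
```

Calling `class_counts(index_images("path/to/dataset"))` gives a quick check of the stated balance between the two classes, and the per-class split keeps that ratio in both the training and test subsets.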



Biological Classification, Characterization of Food