Published: 11 June 2024 | Version 1 | DOI: 10.17632/mwdf2s328d.1
Sudipta Chowdhury, Tanmay Sarkar


The dataset consists of over 500 images of beans (Phaseolus vulgaris L.), categorized into two classes: "good" and "bad." These images were captured using a Realme Narzo 20 Pro mobile camera against a black background under daylight conditions.

**Data Description:**

1. **Classes:**
   - Good: Represents healthy beans showcasing desirable characteristics such as uniform color, shape, and absence of defects, blemishes, or signs of damage or disease.
   - Bad: Encompasses beans exhibiting signs of damage, disease, or other undesirable traits such as discoloration, mold, deformities, or pest infestation.
2. **Image Collection:**
   - The dataset comprises over 500 images, with a substantial number representing both good and bad instances of beans.
   - Images were captured under consistent daylight conditions to ensure uniformity and minimize environmental variability.
   - A black background was utilized to enhance bean visibility and isolate the subject.
3. **Data Source:**
   - Images were captured using a Realme Narzo 20 Pro mobile camera, ensuring consistent image quality and resolution across the dataset.
   - Daylight conditions were chosen to provide natural lighting, reducing artificial effects on bean appearance.
4. **Annotation:**
   - Each image is labeled according to its class (good or bad), facilitating supervised learning tasks.
   - Annotations may include bounding boxes or masks outlining the bean area to aid in localization tasks.
5. **Data Preprocessing:**
   - Preprocessing techniques such as resizing, normalization, and background removal may have been applied to the images to enhance model performance and reduce computational complexity.
   - Metadata such as image resolution, format, and capture settings may accompany the dataset for reference.
6. **Data Distribution:**
   - The dataset maintains a balanced distribution between good and bad beans, ensuring equal representation of both classes.
   - Randomization techniques may have been employed during data collection and organization to mitigate biases in model training.
7. **Potential Applications:**
   - The dataset can be utilized for various machine learning tasks, including classification, object detection, and image segmentation, particularly in agricultural applications.
   - Applications may include automated sorting systems for bean quality control, disease detection, and yield optimization.
8. **Limitations:**
   - Despite efforts to ensure data consistency and quality, variations in lighting conditions, camera angles, and bean orientation may introduce some degree of variability.
   - The dataset primarily focuses on beans of the Phaseolus vulgaris L. variety and may not generalize well to other bean varieties or environmental conditions.
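
For readers who want to use the images for the classification task mentioned above, the following is a minimal sketch of how the two classes might be indexed and split for supervised learning. It assumes a hypothetical folder layout with one subdirectory per class (`good/` and `bad/`) and common image extensions; the description does not state the actual archive structure, so the directory names, the `IMAGE_SUFFIXES` set, and all function names here are illustrative assumptions.

```python
import random
from pathlib import Path

# Hypothetical layout (not confirmed by the dataset description):
#   <root>/good/<image files>
#   <root>/bad/<image files>
CLASSES = ("good", "bad")
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed file types


def list_labeled_images(root):
    """Return (path, label) pairs for every image found under each class folder."""
    samples = []
    for label in CLASSES:
        class_dir = Path(root) / label
        if not class_dir.is_dir():
            continue  # tolerate a missing class folder
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


def class_counts(samples):
    """Count samples per class, e.g. to check the stated class balance."""
    return {
        label: sum(1 for _, sample_label in samples if sample_label == label)
        for label in CLASSES
    }


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Shuffle within each class and split, so both parts keep a similar class ratio."""
    rng = random.Random(seed)
    train, test = [], []
    for label in CLASSES:
        group = [s for s in samples if s[1] == label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test
```

Calling `class_counts(list_labeled_images(root))` on the extracted archive is a quick way to check the class balance before training, and the per-class shuffle in `stratified_split` keeps that balance in both the training and test portions.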



Biological Classification, Characterization of Food