Tomato_detection

Published: 13 May 2026 | Version 1 | DOI: 10.17632/vy5y8p2kdb.1
Contributors:

Description

This dataset comprises 1,600 tomato images collected for binary classification of Fresh and Rotten tomatoes, intended for use in agricultural quality control and post-harvest inspection systems. The original dataset consisted of 800 images, of which 200 were reserved for validation and 200 for testing to ensure unbiased evaluation. Only the remaining 400 training images were augmented, using Roboflow, to generate 1,200 training samples, resulting in a 75/12.5/12.5 train-validation-test split (1,200/200/200 images). Each image was pre-processed with auto-orientation (EXIF stripping) and resized to 640×640 pixels using a fit-with-black-edges strategy to maintain aspect ratio. To improve model generalization, the following augmentations were applied to produce 3 versions of each training image: vertical flip, random rotation between −15° and +15°, saturation adjustment between −15% and +15%, exposure adjustment between −10% and +10%, and Gaussian blur of up to 0.5 pixels. The dataset is annotated in YOLOv8 bounding box format with two classes (Fresh and Rotten) and is publicly available under a CC BY 4.0 license via Roboflow Universe.
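As a quick check on the figures in the description, the split sizes can be recomputed from the stated counts. This is a minimal sketch: the function and variable names are illustrative and are not part of the dataset or of Roboflow. With 1,200 training images out of 1,600, validation and test each account for 12.5% of the dataset.

```python
# Counts taken from the dataset description.
ORIGINAL_IMAGES = 800
VALID_IMAGES = 200
TEST_IMAGES = 200
VERSIONS_PER_TRAIN_IMAGE = 3  # augmented versions generated per training image


def split_summary(original, valid, test, versions):
    """Return image counts and percentage shares for each split."""
    base_train = original - valid - test   # images left for training
    train = base_train * versions          # training set after augmentation
    total = train + valid + test
    counts = {"train": train, "valid": valid, "test": test}
    shares = {name: 100 * n / total for name, n in counts.items()}
    return {"counts": counts, "total": total, "shares": shares}


summary = split_summary(
    ORIGINAL_IMAGES, VALID_IMAGES, TEST_IMAGES, VERSIONS_PER_TRAIN_IMAGE
)
for name, n in summary["counts"].items():
    print(f"{name}: {n} images ({summary['shares'][name]:.1f}%)")
print("total:", summary["total"])
```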

Files

Steps to reproduce

The dataset was constructed and prepared using the following steps. First, tomato images were collected from various sources, capturing both fresh and rotten tomatoes under different lighting and background conditions. The raw images were uploaded to Roboflow, where each image was manually annotated with bounding boxes labelled with one of two classes: Fresh or Rotten. The annotated dataset originally contained 800 images, which were split so that 200 images were allocated to the validation set and 200 to the test set, while the remaining 400 images formed the base training set. To increase the size and diversity of the training data, data augmentation was applied via Roboflow, generating 3 augmented versions of each training image through vertical flipping, random rotation between −15° and +15°, saturation adjustment between −15% and +15%, exposure adjustment between −10% and +10%, and Gaussian blur of up to 0.5 pixels, resulting in a final training set of 1,200 images. All images were pre-processed with auto-orientation correction and resized to 640×640 pixels using a fit-with-black-edges method to preserve the original aspect ratio. The final dataset was exported in YOLOv8 format and is accessible via Roboflow Universe at https://universe.roboflow.com/arafatislam1811-gmail-com/tomato_dataset-ku7jj.
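The resize step and the exported label format can be illustrated with a short sketch. It rests on assumptions not stated in the record: that the black padding is centred, that scaled sizes are rounded to the nearest pixel, and that class index 0 maps to Fresh and 1 to Rotten (the authoritative order is in the export's data.yaml). The function names are hypothetical and do not reproduce Roboflow's implementation. The label parsing follows the standard YOLO text layout of one box per line, "class x_center y_center width height", with coordinates normalised to the image size.

```python
def letterbox_geometry(width, height, target=640):
    """Scaled size and black-edge padding for a fit-within resize.

    Assumes centred padding; Roboflow's exact rounding may differ.
    """
    scale = min(target / width, target / height)  # keep aspect ratio
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


def yolo_to_pixels(line, img_w=640, img_h=640, class_names=("Fresh", "Rotten")):
    """Convert one YOLO label line to (class_name, x_min, y_min, x_max, y_max).

    The class order in class_names is an assumption; check data.yaml.
    """
    class_id, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    return (
        class_names[int(class_id)],
        xc - w / 2,
        yc - h / 2,
        xc + w / 2,
        yc + h / 2,
    )


# A 1280x720 source image is scaled to 640x360 and padded top and bottom.
print(letterbox_geometry(1280, 720))

# A hypothetical label line: a box centred in the 640x640 image.
print(yolo_to_pixels("0 0.5 0.5 0.25 0.5"))
```

For training, the exported folder is normally consumed directly by a YOLOv8-compatible trainer, so manual parsing like this is mainly useful for inspecting or validating labels.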

Institutions

Categories

Computer Vision, Object Detection, Tomato, Binary Classification

Licence