BD-GI-Desserts13

Published: 8 April 2026 | Version 1 | DOI: 10.17632/5n328km3rr.1
Contributors:
Dip Kundu

Description

BD-GI-Desserts13 is a fine-grained visual dataset comprising 11,692 high-resolution RGB images (512×512 pixels) spanning thirteen dessert categories that are officially certified under Bangladesh's Geographical Indication (GI) registry. The dataset was constructed to support origin-aware food image recognition research and to address the absence of a dedicated benchmark for GI-protected Bangladeshi sweets.

Dataset Coverage

Images were collected on-site from authentic GI dessert producers across ten districts of Bangladesh: Bogura, Brahmanbaria, Cumilla, Gopalganj, Muktagacha (Mymensingh), Natore, Netrokona, Sherpur, Tangail, and Kushtia.

The thirteen classes are:
Bogura Doi
Brahmanbaria Chhanamukhi Mishti
Cumilla Rashmalai
Gopalganj Rosogolla
Muktagacha Monda (Brown)
Muktagacha Monda (White)
Natore Kachagolla
Netrokona Balish Mishti
Sherpur Chanapayesh
Tangail Chomchom
Tangail Sondesh (Brown)
Tangail Sondesh (White)
Kushtia Tiler Khaza

Files

Steps to reproduce

The dataset was constructed through a structured pipeline of field collection, quality filtering, preprocessing, anonymization, annotation, and stratified splitting.

Step 1 – Identify GI-Certified Dessert Classes
Consult the official Bangladesh GI registry (DPDT) to confirm the thirteen target dessert classes and their corresponding districts.

Step 2 – On-Site Image Collection
Organize field visits to authentic GI-registered sweet producers, shops, and markets in each target district. Obtain verbal permission from shop owners before photographing. Capture images using consumer-grade smartphones under unconstrained real-world conditions, including indoor sweet shop counters, outdoor market stalls, and home or kitchen settings. Photograph each dessert from multiple viewpoints and distances to maximize intra-class visual diversity. Record metadata for every image at the time of capture, including the GI district of origin, capture environment (indoor shop, outdoor stall, or home kitchen), and smartphone device model.

Step 3 – Quality Filtering
Manually review all captured images and discard those that are blurry, severely over- or under-exposed, partially obstructed, or ambiguous in class identity. Retain only images where the target dessert is clearly and unambiguously visible.

Step 4 – Anonymization
Inspect all retained images for visible human faces or personally identifiable information and remove or blur any such content before inclusion in the dataset.

Step 5 – Preprocessing and Standardization
Resize all accepted images to a fixed resolution of 512×512 pixels using bilinear or bicubic interpolation. Save all images in RGB format (3-channel, 8-bit per channel). Apply a consistent file naming convention following the district–dessert pattern, for example Cumilla_Rashmalai_0001.jpg.

Step 6 – Dataset Organization and Annotation
Organize images into 13 class folders named according to the district–dessert convention. Assign a single image-level class label to each image corresponding to its folder name. Attach the collected metadata as a structured CSV file with columns: filename, class, district, environment, and device.

Step 7 – Stratified Train/Validation/Test Split
Apply class-stratified sampling to partition the full dataset into three subsets: 70% training, 15% validation, and 15% test. Ensure each class is proportionally represented across all three subsets. Save split assignments as splits.csv with columns: filename, class, and split.

Step 8 – Verification
Confirm the total image count across all classes sums to 11,692 images. Verify that class distribution is consistent across training, validation, and test splits. Cross-check metadata completeness to ensure every image entry has a valid district, environment, and device record.
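
The class-stratified 70/15/15 split and the splits.csv file described in Step 7 could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the contributors' actual script. The function names, the fixed random seed, and the split labels "train", "val", and "test" are assumptions; the record specifies only the ratios and the three CSV columns.

```python
import csv
import random
from collections import defaultdict


def stratified_split(labels, ratios=(0.70, 0.15, 0.15), seed=0):
    """Assign every filename to a split, sampling within each class.

    labels: dict mapping filename -> class name.
    Returns a dict mapping filename -> "train", "val", or "test"
    (these split labels are an assumption, not taken from the dataset).
    """
    by_class = defaultdict(list)
    for filename, cls in labels.items():
        by_class[cls].append(filename)

    rng = random.Random(seed)  # fixed seed (assumed) so the split is repeatable
    assignment = {}
    for cls in sorted(by_class):
        files = sorted(by_class[cls])
        rng.shuffle(files)
        n_train = round(ratios[0] * len(files))
        n_val = round(ratios[1] * len(files))
        # Whatever remains after train and val goes to test.
        for index, filename in enumerate(files):
            if index < n_train:
                assignment[filename] = "train"
            elif index < n_train + n_val:
                assignment[filename] = "val"
            else:
                assignment[filename] = "test"
    return assignment


def write_splits_csv(path, labels, assignment):
    """Write split assignments with the columns named in Step 7."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["filename", "class", "split"])
        for filename in sorted(labels):
            writer.writerow([filename, labels[filename], assignment[filename]])
```

Because the counts are rounded per class, each class lands close to, but not always exactly on, the 70/15/15 proportions when its image count is not a multiple of 20.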
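
The checks in Step 8 can be expressed as a small script over the metadata and split CSV files. The sketch below is one hedged way to do it. It assumes the column names given in Steps 6 and 7, the split labels "train", "val", and "test", and a tolerance of two percentage points per class; none of these last two details is stated in the record, and the function names are hypothetical.

```python
import csv
from collections import Counter, defaultdict

EXPECTED_TOTAL = 11692  # total image count stated in the description
REQUIRED_FIELDS = ("district", "environment", "device")
# Target shares from Step 7; the split label names are an assumption.
TARGETS = {"train": 0.70, "val": 0.15, "test": 0.15}


def read_rows(path):
    """Load a CSV file as a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def verify(metadata_rows, split_rows, expected_total=EXPECTED_TOTAL,
           tolerance=0.02):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []

    # Check 1: total image count.
    if len(metadata_rows) != expected_total:
        problems.append(
            f"expected {expected_total} images, found {len(metadata_rows)}")

    # Check 2: every image has non-empty district, environment, and device.
    for row in metadata_rows:
        missing = [f for f in REQUIRED_FIELDS if not (row.get(f) or "").strip()]
        if missing:
            problems.append(f"{row.get('filename')}: missing {', '.join(missing)}")

    # Check 3: each class stays near the target share in every split.
    per_class = defaultdict(Counter)
    for row in split_rows:
        per_class[row["class"]][row["split"]] += 1
    for cls in sorted(per_class):
        total = sum(per_class[cls].values())
        for split, target in TARGETS.items():
            share = per_class[cls][split] / total
            if abs(share - target) > tolerance:
                problems.append(f"{cls}: {split} share is {share:.3f}")

    return problems
```

A typical call would be verify(read_rows("metadata.csv"), read_rows("splits.csv")), where metadata.csv is a placeholder name since the record does not name the metadata file.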

Institutions

Categories

Computer Science, Artificial Intelligence, Computer Vision, Image Processing, Dessert, Deep Learning

Licence