A Benchmark Image Dataset for Scabies Detection_2.0

Published: 27 April 2026 | Version 1 | DOI: 10.17632/mg6sb9b8x4.1
Contributors:
Mrithunjoy Das Shiblu, Fatema Zahra

Description

This dataset contains clinical skin images for automated scabies detection using deep learning and computer vision techniques. It was prepared to support research in dermatological image analysis and to accompany a Data in Brief publication.

The dataset is organized into two main categories:
- Original Data – Raw collected skin images
- Augmented Data – Artificially expanded images generated from the original dataset to improve model robustness

Each category includes two classes:
- Scabies
- Healthy

All images are resized to at least 512 × 512 pixels and stored in JPG format to ensure compatibility with standard deep learning frameworks.

The original images were collected under a strict privacy protocol. Only affected regions were captured, and no facial features or personally identifiable information are included in the dataset. All data were anonymized prior to release.

Data augmentation techniques such as rotation, flipping, scaling, and brightness variation were applied to increase dataset diversity and improve the generalization of machine learning models. Augmented images are stored separately from the original images to maintain transparency and reproducibility.

This dataset is intended for:
- Deep learning research
- Medical image classification
- Skin disease detection benchmarking
- Computer vision applications in healthcare

The dataset may be used for academic and research purposes in accordance with the specified license. If this dataset is used in research, please cite the associated Data in Brief publication.
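
The folder organization described above can be indexed with a few lines of Python. The following is a minimal sketch, not part of the dataset itself: the folder names `Original`, `Augmented`, `Scabies`, and `Healthy` are assumed to match the category and class names given in the description, and the function name `index_dataset` and the label mapping are hypothetical. Adjust the names to the actual archive layout after download.

```python
from pathlib import Path

# Assumed folder names, taken from the category and class names in the
# description; the real archive may spell them differently.
SPLITS = ("Original", "Augmented")
CLASSES = ("Healthy", "Scabies")
LABELS = {name: idx for idx, name in enumerate(CLASSES)}  # Healthy=0, Scabies=1


def index_dataset(root, splits=SPLITS):
    """Return a list of (path, label, split) tuples for every JPG found."""
    root = Path(root)
    samples = []
    for split in splits:
        for cls in CLASSES:
            folder = root / split / cls
            if not folder.is_dir():
                continue  # tolerate a missing folder instead of failing
            for path in sorted(folder.iterdir()):
                if path.suffix.lower() in {".jpg", ".jpeg"}:
                    samples.append((path, LABELS[cls], split))
    return samples
```

Keeping the split name in each record makes it easy to evaluate on images from the Original folder only, for example with `index_dataset(root, splits=("Original",))`. Because augmented images are derived from original ones, this is one way to avoid testing on transformed copies of training images.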

Steps to reproduce

The original skin images were collected using a smartphone camera under natural and indoor lighting conditions. Only affected regions were captured to ensure privacy, and no facial or personally identifiable information was included. All images were manually reviewed to remove low-quality or duplicate samples.

The images were resized to at least 512 × 512 pixels, converted to RGB format, and saved as JPG files using Python-based image processing tools. A structured naming convention was applied for consistency.

To increase dataset diversity and improve model generalization, data augmentation techniques such as rotation, flipping, scaling, and brightness adjustment were applied using deep learning libraries. Augmented images are stored separately from the original images to ensure transparency and reproducibility. The dataset is organized into Original and Augmented folders, each containing two classes: Scabies and Healthy.
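
The preprocessing and augmentation steps above could be approximated as follows. This is a sketch under stated assumptions, not the authors' actual pipeline: it uses Pillow instead of the unspecified deep learning libraries, it reads "at least 512 × 512" as "both sides at least 512 pixels, aspect ratio preserved", and every function name and parameter range (rotation angle, scale factor, brightness factor, JPEG quality) is hypothetical.

```python
import random


def target_size(width, height, min_side=512):
    """Return a size whose shorter side is at least min_side.

    Images that already meet the minimum keep their size; smaller ones
    are scaled up with the aspect ratio preserved (an assumed reading of
    "at least 512 x 512").
    """
    shorter = min(width, height)
    if shorter >= min_side:
        return width, height
    scale = min_side / shorter
    # max() guards against rounding landing one pixel below the minimum.
    return (max(min_side, round(width * scale)),
            max(min_side, round(height * scale)))


def preprocess(src, dst, min_side=512):
    """Convert one image to RGB, enforce the minimum size, save as JPG."""
    from PIL import Image  # Pillow, imported lazily (third-party dependency)

    with Image.open(src) as im:
        im = im.convert("RGB")
        size = target_size(*im.size, min_side=min_side)
        if size != im.size:
            im = im.resize(size, Image.LANCZOS)
        im.save(dst, format="JPEG", quality=95)  # quality is an assumption


def augment(im, rng=random):
    """Apply one random rotation, flip, scaling, and brightness change.

    All ranges below are illustrative values, not those used to build
    the Augmented folder.
    """
    from PIL import ImageEnhance, ImageOps

    out = im.rotate(rng.uniform(-20, 20))        # rotation, in degrees
    if rng.random() < 0.5:
        out = ImageOps.mirror(out)               # horizontal flip
    factor = rng.uniform(1.0, 1.2)               # scaling (upscale only)
    w, h = out.size
    out = out.resize((round(w * factor), round(h * factor)))
    return ImageEnhance.Brightness(out).enhance(rng.uniform(0.8, 1.2))
```

Passing a seeded generator, for example `augment(im, rng=random.Random(0))`, makes the augmented output repeatable, which fits the stated goal of reproducibility.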

Categories

Engineering Research Data Management

Licence