The HFD-8000

Published: 1 October 2024 | Version 1 | DOI: 10.17632/r88c3hr2x6.1
Contributors:
Kazi Rifat Ahmed

Description

The HFD-8000 dataset consists of 8000 labeled human face images for research on image forgery detection, specifically distinguishing real images from GAN-generated ones. It is structured for binary classification: 6400 images for training, 1600 for testing, and a description file containing the label information. All images are resized to 224x224 pixels and stored as JPEG, giving a standardized input format for training deep learning models. The dataset is intended as a resource for research on detecting deepfakes and AI-generated image forgeries.
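
A downloaded copy can be checked against the split sizes stated above. The following is a minimal sketch, not part of the dataset itself: it assumes the images sit in one sub-folder per split named "train" and "test" (hypothetical names, since the description does not specify the folder layout), and the function names are illustrative only.

```python
from pathlib import Path

# Split sizes stated in the dataset description.
EXPECTED_COUNTS = {"train": 6400, "test": 1600}

JPEG_SUFFIXES = {".jpg", ".jpeg"}


def count_jpegs(folder):
    """Count JPEG files anywhere below `folder`."""
    return sum(
        1
        for path in Path(folder).rglob("*")
        if path.is_file() and path.suffix.lower() in JPEG_SUFFIXES
    )


def check_split_sizes(root, expected=EXPECTED_COUNTS):
    """Return {split: (found, expected)} for every split whose count is off.

    Assumes one sub-folder per split under `root`; this layout is a
    hypothetical convention, not something the dataset page documents.
    A missing split folder is reported with a found count of 0.
    """
    mismatches = {}
    for split, want in expected.items():
        split_dir = Path(root) / split
        found = count_jpegs(split_dir) if split_dir.is_dir() else 0
        if found != want:
            mismatches[split] = (found, want)
    return mismatches
```

An empty result from check_split_sizes means every split matches its expected count. If the actual archive uses different folder names, pass a different `expected` mapping.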

Steps to reproduce

The real face images were collected from publicly available datasets on Kaggle under a CC BY-NC-SA 4.0 license, while the GAN-generated images were created using state-of-the-art GAN models. All images were preprocessed and resized for consistency. Labels were generated based on each image's statistical properties.

Images: All face images are uniformly resized to 224x224 pixels and saved in JPEG format.

Labels: A CSV or JSON file containing image filenames and the corresponding binary labels (0 for real, 1 for GAN-generated).
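
The label file described above could be read along the following lines. This is a sketch under stated assumptions rather than the dataset's own loader: the field names "filename" and "label" are hypothetical (the description does not give the actual column or key names), and the JSON variant is assumed to be a list of objects with those two keys.

```python
import csv
import json
from pathlib import Path


def load_labels(path, filename_key="filename", label_key="label"):
    """Return {image filename: 0 or 1} read from a CSV or JSON label file.

    The key names are assumptions; adjust them to the real header.
    A ".json" file is assumed to hold a list of objects, any other
    extension is parsed as CSV with a header row.
    """
    path = Path(path)
    if path.suffix.lower() == ".json":
        with path.open(encoding="utf-8") as fh:
            records = json.load(fh)
    else:
        with path.open(newline="", encoding="utf-8") as fh:
            records = list(csv.DictReader(fh))

    labels = {}
    for record in records:
        label = int(record[label_key])
        # The description defines only two classes: 0 = real, 1 = GAN-generated.
        if label not in (0, 1):
            raise ValueError(f"unexpected label {label!r} for {record[filename_key]}")
        labels[record[filename_key]] = label
    return labels


def class_counts(labels):
    """Count real (0) and GAN-generated (1) entries in a label mapping."""
    real = sum(1 for value in labels.values() if value == 0)
    return {"real": real, "gan": len(labels) - real}
```

Calling class_counts on the loaded mapping gives a quick view of the class balance, which the description does not state.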

Categories

Computer Vision, Cybersecurity, Image Processing, Face, Image Classification, Detection Technique, Human, Deep Learning, Forgery, Generative Adversarial Network, Deepfake, Binary Classification

Licence