The HFD-8000
Description
The HFD-8000 dataset consists of 8000 labeled human face images for image forgery detection, specifically for distinguishing real faces from GAN-generated ones. It is structured for binary classification, with 6400 images for training (80%), 1600 for testing (20%), and a description file containing the label information. All images are resized to 224x224 pixels and stored as JPEG, giving a standardized input format for training deep learning models. The dataset is intended to support research on detecting deepfakes and AI-generated image forgeries.
Steps to reproduce
The real face images were collected from publicly available Kaggle datasets released under the CC BY-NC-SA 4.0 license, while the GAN-generated images were created using state-of-the-art GAN models. All images were preprocessed and resized to maintain consistency. Labels were generated based on the images' statistical properties.
Images: All face images are uniformly resized to 224x224 pixels and saved in JPEG format.
Labels: A CSV or JSON file containing image filenames and corresponding binary labels (0 for real, 1 for GAN-generated).
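As an illustration of how the description file might be consumed, here is a minimal Python sketch that reads the labels and counts the two classes. The dataset description does not specify the file layout, so the CSV column names ("filename", "label"), the JSON shape (a single object mapping filename to label), and the function names are all assumptions made for this example.

```python
import csv
import json
from collections import Counter
from pathlib import Path


def load_labels(path):
    """Return {filename: label} from a CSV or JSON description file.

    Assumed layouts (not confirmed by the dataset description):
      CSV  - header row with "filename" and "label" columns
      JSON - a single object mapping filename to label
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as f:
            labels = {row["filename"]: int(row["label"]) for row in csv.DictReader(f)}
    elif suffix == ".json":
        with path.open(encoding="utf-8") as f:
            labels = {name: int(value) for name, value in json.load(f).items()}
    else:
        raise ValueError(f"unsupported description file type: {suffix}")

    # The description states that labels are binary: 0 for real, 1 for GAN-generated.
    bad = [name for name, value in labels.items() if value not in (0, 1)]
    if bad:
        raise ValueError(f"non-binary labels found for {len(bad)} file(s)")
    return labels


def class_counts(labels):
    """Count how many images fall into each class."""
    counts = Counter(labels.values())
    return {"real": counts[0], "gan": counts[1]}
```

A call such as `class_counts(load_labels("labels.csv"))` (the file name is hypothetical) gives a quick check of the class balance before training.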