Cry, Laugh, or Angry? A Benchmark Dataset for Computer Vision-Based Approach to Infant Facial Emotion Recognition
Description
This dataset is a meticulously curated dataset designed for infant facial emotion recognition, featuring four primary emotional expressions: Angry, Cry, Laugh, and Normal. The dataset aims to facilitate research in machine learning, deep learning, affective computing, and human-computer interaction by providing a large collection of labeled infant facial images. Primary Data (1600 Images): - Angry: 400 - Cry: 400 - Laugh: 400 - Normal: 400 Data Augmentation & Expanded Dataset (26,143 Images): To enhance the dataset's robustness and expand the dataset, 20 augmentation techniques (including HorizontalFlip, VerticalFlip, Rotate, ShiftScaleRotate, BrightnessContrast, GaussNoise, GaussianBlur, Sharpen, HueSaturationValue, CLAHE, GridDistortion, ElasticTransform, GammaCorrection, MotionBlur, ColorJitter, Emboss, Equalize, Posterize, FogEffect, and RainEffect) were applied randomly. This resulted in a significantly larger dataset with: - Angry: 5,781 - Cry: 6,930 - Laugh: 6,870 - Normal: 6,562 Data Collection & Ethical Considerations: The dataset was collected under strict ethical guidelines to ensure compliance with privacy and data protection laws. Key ethical considerations include: 1. Ethical Approval: The study was reviewed and approved by the Institutional Review Board (IRB) of Daffodil International University under Reference No: REC-FSIT-2024-11-10. 2. Informed Parental Consent: Written consent was obtained from parents before capturing and utilizing infant facial images for research purposes. 3. Privacy Protection: No personally identifiable information (PII) is included in the dataset, and images are strictly used for research in AI-driven emotion recognition. Data Collection Locations & Geographical Diversity: To ensure diversity in infant facial expressions, data collection was conducted across multiple locations in Bangladesh, covering healthcare centers and educational institutions: 1. 250-bed District Sadar Hospital, Sherpur (Latitude: 25.019405 & Longitude: 90.013733) 2. Upazila Health Complex, Baraigram, Natore (Latitude: 24.3083 & Longitude: 89.1700) 3. Char Bhabna Community Clinic, Sherpur (Latitude: 25.0188 & Longitude: 90.0175) 4. Jamiatul Amin Mohammad Al-Islamia Cadet Madrasa, Khagan, Dhaka (Latitude: 23.872856 & Longitude: 90.318947) Face Detection Methodology: To extract the facial regions efficiently, RetinaNet—a deep learning-based object detection model—was employed. The use of RetinaNet ensures precise facial cropping while minimizing background noise and occlusions. Potential Applications: 1. Affective Computing: Understanding infant emotions for smart healthcare and early childhood development. 2. Computer Vision: Training deep learning models for automated infant facial expression recognition. 3. Pediatric & Mental Health Research: Assisting in early autism screening and emotion-aware AI for child psychology. 4. Human-Computer Interaction (HCI): Designing AI-powered assistive technologies for infants.