THVD (Talking Head Video Dataset)

Published: 2 April 2025 | Version 1 | DOI: 10.17632/ykhw8r7bfx.1
Contributor: Mario Peedor

Description

**About**

We provide a comprehensive talking-head video dataset with over 50,000 videos, totaling more than 500 hours of footage and featuring 23,841 unique identities from around the world.

**Distribution**

Format, size, and structure of the dataset:

Data Volume:
- Total Size: 2.5 TB
- Total Videos: 47,200
- Identities Covered: 23,000
- Resolution: 60% 4K (1980), 33% Full HD (1080)
- Formats: MP4
- Full-length videos with visible mouth movements in every frame.
- Minimum face size of 400 pixels.
- Video durations range from 20 seconds to 5 minutes.
- Faces have not been cropped out; videos are full-screen and include backgrounds.

**Usage**

**This dataset is suited to a variety of applications:**
- Face Recognition & Verification: Training and benchmarking facial recognition models.
- Action Recognition: Identifying human activities and behaviors.
- Re-Identification (Re-ID): Tracking identities across different videos and environments.
- Deepfake Detection: Developing methods to detect manipulated videos.
- Generative AI: Training high-resolution video generation models.
- Lip Syncing Applications: Enhancing AI-driven lip-syncing models for dubbing and virtual avatars.
- Background AI Applications: Developing AI models for automated background replacement, segmentation, and enhancement.

**Coverage**

Scope and coverage of the dataset:
- Geographic Coverage: Worldwide
- Time Range: The time range and size of each video are recorded in the CSV file.
- Demographics: Includes information about age, gender, ethnicity, format, resolution, and file size.
**Languages Covered (Videos):**
- English: 23,038 videos
- Portuguese: 1,346 videos
- Spanish: 677 videos
- Norwegian: 1,266 videos
- Swedish: 1,056 videos
- Korean: 848 videos
- Polish: 1,807 videos
- Indonesian: 1,163 videos
- French: 1,102 videos
- German: 1,276 videos
- Japanese: 1,433 videos
- Dutch: 1,666 videos
- Indian: 1,163 videos
- Czech: 590 videos
- Chinese: 685 videos
- Italian: 975 videos

**Who Can Use It**

Intended users and their use cases:
- Data Scientists: Training machine learning models for video-based AI applications.
- Researchers: Studying human behavior, facial analysis, or video AI advancements.
- Businesses: Developing facial recognition systems, video analytics, or AI-driven media applications.

**Additional Notes**

- Ensure ethical usage and compliance with privacy regulations.
- The dataset’s quality and scale make it valuable for high-performance AI training.
- Preprocessing (cropping, downsampling) may be needed for some use cases.
- The dataset is not yet complete and expands daily; please contact the contributor for the most up-to-date CSV file.
- The dataset has been divided into 100 GB zipped files and is hosted on a private server (with the option to upload to the cloud if needed).
- To verify the dataset's quality, please contact me for the full CSV file. I’d be happy to provide example videos selected by the potential buyer.
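The per-language video counts listed in this description can be tallied to see how many videos they account for and what share is English. The following is a minimal Python sketch that uses only the figures quoted in the description; the variable names are illustrative, and because the dataset is said to expand daily, the current CSV file may give different numbers.

```python
# Per-language video counts, copied from the description above.
language_counts = {
    "English": 23_038,
    "Portuguese": 1_346,
    "Spanish": 677,
    "Norwegian": 1_266,
    "Swedish": 1_056,
    "Korean": 848,
    "Polish": 1_807,
    "Indonesian": 1_163,
    "French": 1_102,
    "German": 1_276,
    "Japanese": 1_433,
    "Dutch": 1_666,
    "Indian": 1_163,
    "Czech": 590,
    "Chinese": 685,
    "Italian": 975,
}

# "Total Videos" figure stated under Distribution.
stated_total_videos = 47_200

# Number of videos accounted for by the language list.
listed_total = sum(language_counts.values())

# Share of the listed videos that are in English.
english_share = language_counts["English"] / listed_total

# Videos in the stated total that the language list does not account for.
unlisted = stated_total_videos - listed_total

print(f"Videos with a listed language: {listed_total:,}")
print(f"English share of listed videos: {english_share:.1%}")
print(f"Stated total minus listed total: {unlisted:,}")
```

The listed languages sum to fewer videos than the stated total, so it may be worth checking the full CSV file for the remaining entries before planning a per-language split.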

Categories

Video, Audio Analysis, Deep Learning

Licence