EduVid-QA: An Annotated Question–Answering Dataset for Educational YouTube Videos

Name: EduVid-QA: An Annotated Question–Answering Dataset for Educational YouTube Videos
Creator: Naveed Ejaz
Published: 2025-12-15T14:48:29.375Z
Keywords: Artificial Intelligence, Computer Vision, Natural Language Processing

doi:10.17632/yt4nmz9mcv.1

Published: 15 December 2025 | Version 1 | DOI: 10.17632/yt4nmz9mcv.1

Contributor:

Naveed Ejaz

Description

EduVid-QA is an annotated question–answering dataset built from publicly available educational YouTube videos. The dataset contains links to the videos with aligned transcripts, manually created multiple-choice and open-ended questions, and verified reference answers grounded in the video and transcript content. Only video links and derived annotations are provided; the videos themselves are not redistributed. EduVid-QA is intended for research on educational video understanding, multimodal reasoning, and retrieval-based question answering.

Files

Steps to reproduce

1. Download the annotations and video links from Mendeley Data.
2. Access the videos via the provided YouTube links.
3. Generate transcripts and, optionally, extract frames.
4. Evaluate QA models on the predefined splits.
5. Compare predictions with the reference answers using standard QA metrics.

Note: Videos, keyframes, and transcripts are not provided because of copyright restrictions.
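
The comparison in step 5 could be sketched as follows. This is a minimal illustration, not the dataset's official evaluation script: the record does not name which "standard QA metrics" are intended, so exact match and token-level F1 (for open-ended answers) and plain accuracy (for multiple-choice answers) are assumed here. The function names `normalize`, `exact_match`, `token_f1`, and `mc_accuracy` are hypothetical, and loading the annotation files is left out because their format is not described.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mc_accuracy(predicted: list[str], gold: list[str]) -> float:
    """Fraction of multiple-choice items whose predicted option label matches."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and equal length")
    hits = sum(p.strip().lower() == g.strip().lower() for p, g in zip(predicted, gold))
    return hits / len(gold)
```

Averaging `exact_match` and `token_f1` over all open-ended items in a split, and reporting `mc_accuracy` separately for the multiple-choice items, gives one score per question type. The article-stripping step in `normalize` assumes English answers.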

Institutions

University of Ulster at Belfast
Iqra University - Islamabad Campus
