EduVid-QA: An Annotated Question–Answering Dataset for Educational YouTube Videos

Published: 15 December 2025 | Version 1 | DOI: 10.17632/yt4nmz9mcv.1
Contributor:
Naveed Ejaz

Description

EduVid-QA is an annotated question–answering dataset built from publicly available educational YouTube videos. The dataset contains links to the videos with aligned transcripts, manually created multiple-choice and open-ended questions, and verified reference answers grounded in the video and transcript content. Only video links and derived annotations are provided; the videos themselves are not redistributed. EduVid-QA is intended for research on educational video understanding, multimodal reasoning, and retrieval-based question answering.

Files

Steps to reproduce

1. Download the annotations and video links from Mendeley Data.
2. Access the videos via the provided YouTube links.
3. Generate the transcripts and, optionally, extract frames.
4. Evaluate QA models on the predefined splits.
5. Compare predictions with reference answers using standard QA metrics.

Note: Videos, keyframes, and transcripts are not provided because of copyright restrictions.
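The comparison of predictions with reference answers can be sketched as follows. The dataset description does not name specific metrics, so this example assumes two that are commonly used for open-ended QA: exact match and token-level F1 after light text normalization. The function names (`normalize`, `exact_match`, `token_f1`) are illustrative and are not part of the dataset.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall (assumed metric)."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For multiple-choice questions, plain accuracy over the selected option is usually sufficient; the token-level score is mainly useful for open-ended answers, where a prediction can be partially correct.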

Institutions

  • University of Ulster at Belfast
  • Iqra University - Islamabad Campus

Categories

Artificial Intelligence, Computer Vision, Natural Language Processing

Licence