Information Retrieval Dataset - Internet Movie Database (IMDB)

Published: 8 June 2017 | Version 2 | DOI: 10.17632/rth2kr5hxf.2
Renato Alves


This dataset was built for an Information Retrieval research project carried out as part of a master's degree at the Federal University of Rio de Janeiro (UFRJ). It is a collection of nearly 115,000 documents in XML format, drawn as a subset of the Internet Movie Database (IMDB). Each XML file describes one movie in the collection and contains the following information:

· ID
· Title
· Year
· Country
· Actors (and their roles)
· Actresses (and their roles)
· Genre
· Color Info
· Language
· Sound Info
· Directors
· Writers
· Composers
· Certificates (by country)
· Duration
· Shooting location (cities and countries)
· Editors
· Release date (by country)
· Producers
· Type (film, TV series, etc.)
· Keywords
· Plot
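
Because the description lists the fields but not the XML tag names, a reasonable first step is to inspect one document without assuming its schema. The following is a minimal Python sketch of that idea. The sample document, its tag names (movie, id, title, year, genre), and the function name summarize_fields are all hypothetical and used only for illustration; the real files may be structured differently.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document: the tag names below are assumptions,
# not the actual schema of the collection.
SAMPLE = """<movie>
  <id>1</id>
  <title>Example Title</title>
  <year>1999</year>
  <genre>Drama</genre>
  <genre>Comedy</genre>
</movie>"""


def summarize_fields(xml_text):
    """Return the root tag and a count of each direct child tag."""
    root = ET.fromstring(xml_text)
    return root.tag, Counter(child.tag for child in root)


root_tag, counts = summarize_fields(SAMPLE)
print(root_tag, dict(counts))
```

Running the same function over the text of a real file from the collection should reveal which tags actually carry the fields listed above, and which of them repeat within a document.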


Steps to reproduce

To use the collection, unzip the attached file.
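
The unzip step can also be scripted. The following is a minimal Python sketch, assuming the attachment is a standard ZIP archive and that the documents carry a lowercase .xml extension; the function name extract_collection and any paths passed to it are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_collection(zip_path, dest_dir):
    """Unzip the archive into dest_dir and return the sorted XML file paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Search recursively, since the archive may contain subdirectories.
    # Assumption: documents use a lowercase ".xml" extension.
    return sorted(dest.rglob("*.xml"))
```

For example, calling extract_collection("collection.zip", "imdb_xml") with the name of the downloaded archive (the name shown here is a placeholder) returns a list of paths whose length can be compared against the expected figure of nearly 115,000 documents.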