Information Retrieval Dataset - Internet Movie Database (IMDB)

Published: 8 June 2017 | Version 2 | DOI: 10.17632/rth2kr5hxf.2
Contributor: Renato Alves

Description

This dataset was constructed for an Information Retrieval research project carried out as part of a master's degree at the Federal University of Rio de Janeiro (UFRJ). It consists of nearly 115,000 documents in XML format, a subset of the Internet Movie Database (IMDB). Each XML file contains the following information about one movie in the collection:

· ID
· Title
· Year
· Country
· Actors (and their roles)
· Actresses (and their roles)
· Genre
· Color Info
· Language
· Sound Info
· Directors
· Writers
· Composers
· Certificates (by country)
· Duration
· Shooting location (cities and countries)
· Editors
· Release date (by country)
· Producers
· Type (film, TV series, etc.)
· Keywords
· Plot

Files

Steps to reproduce

To use the collection, unzip the attached file.
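
For readers who prefer to script this step, the following is a minimal sketch in Python of unzipping the archive and inspecting one document. The archive name `imdb_collection.zip` and the output directory are placeholders, not names taken from the dataset, and the helper functions `extract_collection` and `summarize_document` are hypothetical. Because the exact XML element names are not listed here, the sketch only reports whatever tags it finds rather than assuming a schema.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path


def extract_collection(archive_path, target_dir):
    """Unzip the archive and return the sorted paths of the extracted XML files."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    # Search recursively, since the archive layout is not known in advance.
    return sorted(target.rglob("*.xml"))


def summarize_document(xml_path):
    """Parse one XML file and return (root tag, list of direct child tags)."""
    root = ET.parse(xml_path).getroot()
    return root.tag, [child.tag for child in root]


if __name__ == "__main__":
    # Placeholder file name (an assumption); replace with the downloaded archive.
    archive = Path("imdb_collection.zip")
    if archive.exists():
        files = extract_collection(archive, "imdb_collection")
        print(f"{len(files)} XML files extracted")
        if files:
            print(summarize_document(files[0]))
```

Inspecting the tags of a single file first is a cheap way to learn the actual element names before writing an indexer over the full collection.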

Institutions

Universidade Federal do Rio de Janeiro

Categories

Cinema, Information Retrieval, Search Engine, Web Search Engine, File Searching, Navigation, Cluster Testing, Search

License