News Recommendations Dataset: Headlines & Categories

Published: 12 December 2024 | Version 1 | DOI: 10.17632/pk5vs5wjxm.1
Contributor:
Ankur Ray Chayan

Description

This dataset, titled News Recommendations: Headlines & Categories, contains 1,999 records of news articles sourced from various newspapers. Each record includes the article's headline, the name of the newspaper that published it, a brief description of the article, its associated categories, and a link to the full article for further context. There are 209 unique category values in total, ranging from single labels such as "Business" and "Education" to multi-label combinations such as "Environment, Health" and "Sports, Economy".

The dataset suits machine learning tasks such as text classification, recommendation systems, and natural language processing (NLP). Possible applications include building personalized news recommendation systems, performing sentiment analysis, and experimenting with multi-label learning models. Cleaning the category labels for consistency may be a helpful first step.
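As a minimal sketch of the label-cleaning step mentioned in the description, the Python snippet below splits a comma-separated category string into individual labels, trims whitespace, normalizes capitalization, and drops duplicates. The function name `split_categories` and the sample strings are hypothetical; they are modelled on the examples quoted in the description and are not taken from the dataset files, whose exact column names and label spellings are not stated here.

```python
def split_categories(raw: str) -> list[str]:
    """Split a comma-separated category string into cleaned labels.

    Assumption: multi-label values are comma-separated, as in
    "Environment, Health". Adjust the separator if the files differ.
    """
    seen = set()
    labels = []
    for part in raw.split(","):
        # Collapse stray whitespace and normalise capitalisation.
        # Note: str.title() would turn an acronym such as "AI" into "Ai",
        # so acronym-heavy labels may need a custom mapping instead.
        label = " ".join(part.split()).title()
        if label and label not in seen:
            seen.add(label)
            labels.append(label)
    return labels


# Hypothetical raw values, modelled on the examples in the description.
examples = ["Business", "Environment, Health", " sports ,economy", "Health,  health"]
cleaned = [split_categories(value) for value in examples]
```

Splitting combined values this way yields one list of labels per record, which is the usual input shape for multi-label learning; the number of distinct individual labels may be smaller than the 209 combined values reported above.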

Categories

Social Sciences, Artificial Intelligence, Information Retrieval, Media Studies, Data Science, Natural Language Processing

Licence