Dataset of Arabic Spam and Ham Tweets

Published: 13 June 2023 | Version 1 | DOI: 10.17632/86x733xkb8.1
Contributors:
Sanaa Kaddoura, Safaa Henno

Description

The data were analyzed in this article. Please cite it or cite the data article: Kaddoura, S., Alex, S. A., Itani, M., Henno, S., AlNashash, A., & Hemanth, D. J. (2023). Arabic spam tweets classification using deep learning. Neural Computing and Applications, 1-14.

The data were collected from Twitter using the Twitter API between January 27, 2021, and March 10, 2021. The information downloaded for each tweet is: Tweet ID, DateTime, URL, Tweet Text, User Name, Location, Replied Tweet ID, Replied Tweet User ID, Replied Tweet Username, Retweet Count, Favorite Count, and Favorited. The dataset contains 13241 records, each representing one tweet. Each tweet is labeled either Ham or Spam, where Ham means a non-spam tweet. There are 1924 Spam tweets and 11299 Ham tweets. The tweets are unique, i.e., there are no repeated tweet records.
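The label counts and the uniqueness claim above can be checked after download with a short script. The following is a minimal sketch, not part of the dataset itself: it assumes the data has been exported to a UTF-8 CSV file, and the column names "Tweet Text" and "Label" as well as the function names are hypothetical and should be adjusted to match the actual file header.

```python
import csv
from collections import Counter

# Assumed column names; the header of the real file may differ.
TEXT_COLUMN = "Tweet Text"
LABEL_COLUMN = "Label"


def summarize(rows):
    """Count records, labels, and repeated tweet texts in an iterable of dict rows."""
    labels = Counter()
    seen_texts = set()
    duplicates = 0
    total = 0
    for row in rows:
        total += 1
        labels[row[LABEL_COLUMN].strip()] += 1
        text = row[TEXT_COLUMN].strip()
        if text in seen_texts:
            duplicates += 1  # same text already seen in an earlier row
        seen_texts.add(text)
    return {"total": total, "labels": dict(labels), "duplicates": duplicates}


def summarize_csv(path):
    """Read a UTF-8 CSV export of the dataset (hypothetical path) and summarize it."""
    with open(path, newline="", encoding="utf-8") as handle:
        return summarize(csv.DictReader(handle))


# Tiny made-up sample for illustration only; these are not real tweets.
sample = [
    {TEXT_COLUMN: "first example text", LABEL_COLUMN: "Ham"},
    {TEXT_COLUMN: "second example text", LABEL_COLUMN: "Spam"},
    {TEXT_COLUMN: "first example text", LABEL_COLUMN: "Ham"},
]
print(summarize(sample))
```

Calling summarize_csv on the downloaded file should report the total number of records, the number of tweets per label, and how many tweet texts are repeated, which can then be compared with the figures stated in the description.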

Files

Institutions

Zayed University

Categories

Computer Science, Cybersecurity, Data Science, Machine Learning

Funders

  • Zayed University
    United Arab Emirates
    Grant ID: Start-up Grant [Grant Number R20081]

Licence