Zieni dataset for Phishing detection
Published: 4 September 2024 | Version 1 | DOI: 10.17632/8mcz8jsgnb.1
Contributor:
Rasha Zieni
Description
This dataset was used to train machine learning models for phishing detection and to study the explainability of those models. It was published in 2024. The dataset covers phishing and legitimate websites. Phishing samples were collected from two sources, PhishTank and Tranco, whereas legitimate samples were collected from Alexa. The dataset is balanced: it contains 5,000 phishing and 5,000 legitimate samples, each described by 74 features extracted from the entire URL as well as from the Fully Qualified Domain Name (FQDN), pathname, filename, and parameters. Of these features, 70 are numerical and 4 are binary. The target variable is also binary.
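To illustrate how a URL can be broken into the parts named above (entire URL, FQDN, pathname, filename, and parameters), here is a minimal Python sketch. The function names and the six example features are hypothetical and chosen only for illustration; they are not the dataset's 74 features, whose exact definitions are not given in this description.

```python
from urllib.parse import urlsplit, parse_qsl
import posixpath


def split_url(url: str) -> dict:
    """Split a URL into the component groups mentioned in the description."""
    parts = urlsplit(url)
    # The last path segment is treated as the filename (an assumption).
    _, filename = posixpath.split(parts.path)
    return {
        "url": url,
        "fqdn": parts.hostname or "",
        "pathname": parts.path,
        "filename": filename,
        "parameters": parts.query,
    }


def example_features(url: str) -> dict:
    """Compute a few illustrative features (hypothetical, not the dataset's)."""
    c = split_url(url)
    return {
        # Numerical features: simple lengths and character counts.
        "url_length": len(c["url"]),
        "fqdn_dot_count": c["fqdn"].count("."),
        "path_slash_count": c["pathname"].count("/"),
        "filename_length": len(c["filename"]),
        "param_count": len(parse_qsl(c["parameters"], keep_blank_values=True)),
        # Binary feature: 1 if the URL contains an '@' character, else 0.
        "has_at_symbol": int("@" in c["url"]),
    }


if __name__ == "__main__":
    sample = "http://login.example.com/account/verify.php?id=1&token=abc"
    print(split_url(sample))
    print(example_features(sample))
```

Each feature here is computed from exactly one component, which mirrors the per-component grouping in the description; a real extraction pipeline would likely define many more counts per component.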
Categories
Cybersecurity, Machine Learning, Explainable Artificial Intelligence