Phishing Website Detection Dataset

Published: 4 October 2022 | Version 1 | DOI: 10.17632/kvpkc4j658.1
Contributor:
Pham Tuan Dung

Description

The dataset contains both phishing (malicious) URLs and clean (benign) URLs. The phishing URLs were crawled from phishtank.org, while the clean URLs come from commoncrawl.org. The data is split into two sets: a training set and a test set. The training set has 2 million phishing URLs and 2 million clean URLs. The test set has 1 million phishing URLs and 1 million clean URLs.
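The split sizes above can be checked after download with a short script. The following is a minimal sketch only: this record does not state the file names, file format, or label values, so the (url, label) row layout, the label strings "phishing" and "benign", and the helper names count_labels and matches_description are all assumptions made for illustration. The sample URLs are invented placeholders, not rows from the dataset.

```python
from collections import Counter
from typing import Iterable, Tuple

# Split sizes as stated in the description (URLs per class).
# The label strings "phishing" and "benign" are assumed, not taken from the files.
EXPECTED_COUNTS = {
    "train": {"phishing": 2_000_000, "benign": 2_000_000},
    "test": {"phishing": 1_000_000, "benign": 1_000_000},
}


def count_labels(rows: Iterable[Tuple[str, str]]) -> Counter:
    """Count rows per label; each row is assumed to be a (url, label) pair."""
    return Counter(label for _url, label in rows)


def matches_description(split: str, counts: Counter) -> bool:
    """Return True if the per-class counts equal the sizes stated for `split`."""
    return dict(counts) == EXPECTED_COUNTS[split]


# Tiny in-memory sample with placeholder URLs, standing in for rows
# that would normally be read from the downloaded files.
sample = [
    ("http://example.com/", "benign"),
    ("http://example.org/login", "phishing"),
    ("http://example.net/about", "benign"),
]

sample_counts = count_labels(sample)
print(sample_counts)
print(matches_description("train", sample_counts))
```

In practice, rows would be streamed from the downloaded files (for example with the csv module) rather than held in a list, since each split contains millions of URLs.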

Files

Categories

Machine Learning, Data Analysis

Licence