Published: 9 December 2021 | Version 2 | DOI: 10.17632/ttmmtsgbs8.2


PhishRepo was built to fill the data gap in the anti-phishing domain and is still at an experimental stage. The data available here was collected by PhishRepo during its testing stage and consists of verified phishing webpages. The dataset provides diverse information sources from recent phishing pages. Diverse, feature-rich data of this kind is currently needed in machine learning-based anti-phishing research to overcome weak learning models in phishing detection. The dataset can be used to analyse significant phishing features, experiment with different feature extraction techniques, and try out representation learning techniques such as deep learning on the raw data at a practical level. The dataset contains an index.csv file, which is the main file and should be used to map index entries to the available folders.
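The mapping between index.csv and the folders could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the description above does not document the columns of index.csv, so the column name "folder" and the helper name map_index_to_folders are hypothetical and should be replaced with whatever the real header row contains. The sketch also assumes that index.csv sits in the dataset root next to the folders it refers to.

```python
import csv
from pathlib import Path


def map_index_to_folders(dataset_root, key_column="folder"):
    """Pair each index.csv row with a folder of the same name.

    key_column is an assumed column name, not one documented by the
    dataset; check the header row of index.csv and adjust it.
    """
    root = Path(dataset_root)
    found = {}    # folder name -> {"path": Path, "row": dict of CSV fields}
    missing = []  # folder names listed in the index but absent on disk

    with open(root / "index.csv", newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            name = (row.get(key_column) or "").strip()
            if not name:
                continue  # skip rows with no value in the key column
            folder = root / name
            if folder.is_dir():
                found[name] = {"path": folder, "row": row}
            else:
                missing.append(name)

    return found, missing
```

Calling map_index_to_folders on the extracted dataset directory returns the matched entries together with a list of index rows that have no folder on disk, which gives a quick way to confirm that a download is complete before extracting features.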


Steps to reproduce

The dataset can be downloaded from the PhishRepo data repository.


University of Moratuwa, Uva Wellassa University


Artificial Intelligence, Data Science, Applied Computing, Machine Learning