Dataset of 1000 Images of Malicious and Benign QR codes 2025
Description
This dataset comprises 1,000 high-quality QR code images, organized into two categories:

- 500 QR codes containing verified malicious URLs with a documented history of phishing or malware distribution
- 500 QR codes containing legitimate (benign) URLs from trusted domains

The dataset addresses a gap in current cybersecurity research materials: most publicly available QR code datasets are significantly outdated and do not reflect the current threat landscape. Each QR code was generated at a standardized resolution of 330x330 pixels to allow consistent processing across machine learning frameworks.

The collection is curated to support:

- Binary classification algorithm development and evaluation
- Computer vision-based phishing detection research
- QR code security analysis and threat modeling
- Transfer learning applications in cybersecurity

Malicious URLs were obtained from multiple threat intelligence sources and scanning services, such as VirusTotal, URLhaus, and PhishTank, while benign URLs were selected from Alexa top-ranked domains to ensure legitimacy. The dataset includes a diverse range of URL structures, encoding densities, and error correction levels to better represent real-world QR code variations.

Researchers can use this dataset to train, validate, and test machine learning models for automated QR code threat detection, contributing to improved mobile security as QR codes are increasingly adopted for financial transactions and authentication.
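As a starting point for the train, validate, and test workflow described above, the following is a minimal sketch of building a labeled file list and a class-balanced split using only the Python standard library. The folder names `benign` and `malicious`, the image extensions, and the 70/15/15 split fractions are assumptions for illustration; the description does not state how the files are organized, so adjust them to the actual layout.

```python
import random
from pathlib import Path

# Assumed folder names and labels; the dataset description does not specify them.
CLASS_DIRS = {"benign": 0, "malicious": 1}
# Assumed image file extensions.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_labeled_images(root):
    """Return (path, label) pairs for every image found under root/<class>/."""
    samples = []
    for class_name, label in CLASS_DIRS.items():
        class_dir = Path(root) / class_name
        if not class_dir.is_dir():
            continue  # skip classes that are missing from this copy
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


def stratified_split(samples, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split into (train, val, test) lists, keeping class proportions equal."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val, test = [], [], []
    for label in sorted({lbl for _, lbl in samples}):
        group = [s for s in samples if s[1] == label]
        rng.shuffle(group)
        n_train = round(len(group) * fractions[0])
        n_val = round(len(group) * fractions[1])
        train.extend(group[:n_train])
        val.extend(group[n_train:n_train + n_val])
        test.extend(group[n_train + n_val:])  # remainder goes to test
    return train, val, test
```

Calling `stratified_split(list_labeled_images("path/to/dataset"))` returns three lists of `(path, label)` pairs that can be passed to any image-loading pipeline. Splitting per class keeps the 50/50 balance of the full dataset in each subset, which matters for a collection of this size.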