QR-DN1.0: A new Distorted and Noisy QRs dataset

Published: 17 August 2021 | Version 2 | DOI: 10.17632/t2bdr663ms.2

Description

Barcodes have played a significant role in many industries in recent years, and of the two most popular 2D barcodes, the QR code has grown rapidly in use. The QR-DN1.0 dataset includes 5 categories of QR codes that cover low to high density levels. Each category has 15 QR codes: 5 for testing and 10 for training. Each QR is embedded into 30 color images using blind watermarking techniques and then extracted, with three different methods, from photos taken with a mobile phone camera. This yields three groups of 2250 extracted QR images, for a total of 6750 distorted and noisy QR images. Within each of the three groups, the data is divided into two parts: testing, with 750 images, and training, with 1500 images. For every distorted QR in the dataset, a non-distorted instance of it is provided as ground truth. One advantage of this dataset is that it is real: no simulated noise has been added to the images, and the dataset is derived entirely from the real-world challenge of extracting QRs embedded in color images, captured by photographing the watermarked image on a screen. It also includes various types of QR content, such as single character, short sentence, long sentence, URL and location.
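The image counts follow directly from the figures given in the description. The short sketch below only redoes that arithmetic. The variable names are illustrative and are not part of the dataset. Note that 50 training QRs embedded in 30 host images give 1500 training images per extraction method, which matches the figure stated in the reproduction steps.

```python
# Dataset size arithmetic, using only numbers stated in the description.
# All variable names are illustrative; they do not come from the dataset.
CATEGORIES = 5           # QR density levels
TRAIN_PER_CATEGORY = 10  # training QRs per category
TEST_PER_CATEGORY = 5    # test QRs per category
HOST_IMAGES = 30         # color host images each QR is embedded in
EXTRACTION_METHODS = 3   # simple, quadruple, voted

train_qrs = CATEGORIES * TRAIN_PER_CATEGORY    # 50 source QRs
test_qrs = CATEGORIES * TEST_PER_CATEGORY      # 25 source QRs

train_images = train_qrs * HOST_IMAGES         # per extraction method
test_images = test_qrs * HOST_IMAGES           # per extraction method
per_method = train_images + test_images
total_images = per_method * EXTRACTION_METHODS

print(train_images, test_images, per_method, total_images)
# prints: 1500 750 2250 6750
```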

Steps to reproduce

50 QR codes were selected for the training phase and 25 QR codes for the test phase; these can be found in the folder (… / QR). Each QR image is embedded in 30 host color images. After watermarking, the watermarked results were photographed with different cameras, screens and lighting conditions, and the QRs were then extracted. This produced a dataset of 1500 noisy QR training images and 750 noisy QR test images, each with a resolution of 512 by 512 pixels. Three extraction approaches were considered (simple, quadruple and voted), and an individual dataset was created for each: 2250 training/test samples per approach and 6750 training/test images in total. Each extraction approach has its own folder in the root folder, with separate subfolders for training and test images.
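After downloading, it can be useful to check that each extraction approach contains the expected number of images. The sketch below is a minimal example under stated assumptions: the folder names in `APPROACHES` and `SPLITS` and the file extensions in `IMAGE_SUFFIXES` are hypothetical placeholders, since the exact names used in the archive are not given here, and should be adjusted to match the actual layout.

```python
from pathlib import Path

# Hypothetical folder names and extensions: check the downloaded archive
# and adjust these before use. They are assumptions, not documented names.
APPROACHES = ("simple", "quadruple", "voted")
SPLITS = ("train", "test")
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def count_images(root):
    """Return {(approach, split): number of image files found under root}."""
    root = Path(root)
    counts = {}
    for approach in APPROACHES:
        for split in SPLITS:
            folder = root / approach / split
            if not folder.is_dir():
                # Missing folder: report zero rather than raising an error.
                counts[(approach, split)] = 0
                continue
            counts[(approach, split)] = sum(
                1
                for path in folder.rglob("*")
                if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

If the layout assumptions hold, calling `count_images` on the dataset root should report 1500 training and 750 test images for each approach, in line with the numbers above. If the ground-truth images are stored inside the same subfolders, the counts will be correspondingly higher.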