SYNLIP: Synthetic License Plate Dataset for fingerprint, recognition and character detection

Published: 4 March 2026 | Version 2 | DOI: 10.17632/dfzkr3p73v.2

Description

SYNLIP (SYNthetic LIcense Plate dataset) is a synthetic dataset developed to support research on Automatic License Plate Recognition (ALPR) using computer vision. It integrates three complementary tasks: image matching, license plate (LP) recognition, and character detection.

Image matching refers to the automatic establishment of correspondences between points, grayscale tones, features, relations, or other entities in images. For image matching, noise was added through grayscale conversion, darkening, blurring, and variations in orientation, occlusion and brightness. For LP recognition, images follow a consistent naming scheme (LP_f#.jpg) and cover European and US plate formats. For character detection, YOLO-format annotations are provided for license plate characters only.

In total, SYNLIP includes 400 images with an average size of 260x130 pixels, manually annotated in YOLO detection format. All images were created using several GenAI tools and manually edited to introduce noise. The dataset was shuffled and split 70/30, yielding 280 images for the fingerprint set (knowledge base, KB) and 120 images for image matching queries. A separate shuffle and split in the same proportions was made for the character detection annotations (bounding boxes).

The dataset is distributed as a ZIP archive organized into two main folders:

/detection/ - Includes YOLO-format annotations for character detection
/detection/classes.txt - Classes for character detection
/detection/images/ - Contains images for train and val
/detection/labels/ - Contains labels for train and val
/matching/
/matching/kb/ - Contains images for the knowledge base
/matching/test/ - Contains images for queries

This dataset was produced by researchers from the Digital Transformation CoLAB (DTx CoLAB), Guimarães, Portugal.
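
As an illustration of how the character annotations might be read, the sketch below converts YOLO detection lines into pixel boxes. It assumes the usual YOLO convention of one object per line in the form "class cx cy w h", with coordinates normalized to the image size, and it assumes that class indices refer to the zero-based line order of /detection/classes.txt. The Box type, the parse_yolo_labels function and the sample values are hypothetical and are not taken from the archive.

```python
from dataclasses import dataclass


@dataclass
class Box:
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_labels(text: str, img_w: int, img_h: int) -> list[Box]:
    """Convert YOLO lines (class cx cy w h, normalized to [0, 1]) to pixel boxes."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        cx, cy, w, h = (float(v) for v in parts[1:])
        boxes.append(Box(
            class_id,
            (cx - w / 2) * img_w,   # left edge in pixels
            (cy - h / 2) * img_h,   # top edge in pixels
            (cx + w / 2) * img_w,   # right edge in pixels
            (cy + h / 2) * img_h,   # bottom edge in pixels
        ))
    return boxes


# Invented label content for one 260x130 plate image (not real SYNLIP data).
sample = "0 0.5 0.5 0.2 0.4\n3 0.25 0.5 0.1 0.4\n"
boxes = parse_yolo_labels(sample, img_w=260, img_h=130)
for box in boxes:
    print(box)
```

In practice, the text would come from a file under /detection/labels/ and the image size from the matching file under /detection/images/; the exact subfolder and file names should be checked against the archive.
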
This work was supported by the base funding project of the DTx CoLAB, under the Missão Interface of the Recovery and Resilience Plan (PRR), integrated in notice 01/C05-i02/2022, which aims to deepen and consolidate the network of interface institutions between the academic, scientific and technological system and the Portuguese business fabric.

Version 2: Updated ORCID. No change to data files.
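
For reference, the arithmetic behind the 280/120 partition mentioned in the Description can be reproduced with a minimal shuffle-and-split sketch. The description does not state how the shuffle was performed, so the seed, the shuffle_split helper and the placeholder file names below are assumptions for illustration only.

```python
import random


def shuffle_split(items, first_fraction, seed):
    """Shuffle a copy of items and cut it into two parts at first_fraction."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * first_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder file names; the real archive may name its files differently.
image_names = [f"img_{i:03d}.jpg" for i in range(400)]
kb, queries = shuffle_split(image_names, first_fraction=0.7, seed=0)
print(len(kb), len(queries))
```

With 400 images and a fraction of 0.7, the first part holds 280 items and the second 120, matching the knowledge base and query counts given above.
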

Categories

Computer Vision, Object Detection, Object Recognition, Machine Learning, Image Matching
