Digitize-HCD: A Dataset for Digitization of Handwritten Circuit Diagrams

Published: 24 September 2024 | Version 1 | DOI: 10.17632/rngcz5wtv8.1

Description

This dataset was developed to support research on the digitization of handwritten circuit diagrams. It includes detailed annotations for several aspects of such diagrams: component symbols, text labels, and port locations. The dataset was built from 1277 images of handwritten circuit diagrams drawn by more than 150 volunteers. All diagrams were drawn on white A4 paper using pen or pencil, and the drawings were then captured with a scanner at a resolution of 600 dpi. The resulting images were used to prepare the following annotations:

1. The collected images were annotated for classification and localization of component symbols. Axis-aligned bounding boxes were used to localize the symbols. A total of 17 distinct component symbol classes were annotated across the 1277 images. The annotations are stored in COCO JSON format. Images and annotations are stored in the "Component Symbol and Text Label Data" directory.

2. For the same set of circuit diagram images, component text labels were localized by annotating their bounding regions with polygons, and the text content within each annotated polygon was manually transcribed. The data is stored in a JSON file using an annotation format adapted from the OCRDataset structure of MMOCR (https://github.com/open-mmlab/mmocr). Annotations for text labels are stored in the "Component Symbol and Text Label Data" directory.

3. For each of the 17 component symbol classes, a separate dataset was extracted from the original set of circuit images and their associated symbol detection data. Each of these datasets includes individual component images, cropped to the bounding box around the symbol, together with corresponding ground-truth heatmap images that represent the probability distribution of the port locations. The "Component Port Location Data" directory consists of 17 subdirectories, one per component symbol class, each containing the port location data for that class.
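As a rough illustration of how the COCO JSON symbol annotations might be read, the sketch below counts bounding boxes per class and lists the boxes for one image. It relies only on the standard COCO layout ("images", "annotations", "categories", with "bbox" given as [x, y, width, height]). The file name, class names, and box values in the sample are hypothetical stand-ins, not taken from the dataset, and the actual annotation file may carry additional fields.

```python
import json
from collections import Counter


def count_symbols_per_class(coco: dict) -> Counter:
    """Count annotated bounding boxes per category name in a COCO-style dict."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    return Counter(id_to_name[a["category_id"]] for a in coco["annotations"])


def boxes_for_image(coco: dict, file_name: str) -> list:
    """Return (class_name, [x, y, width, height]) pairs for one image file."""
    image_ids = {im["id"] for im in coco["images"] if im["file_name"] == file_name}
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    return [
        (id_to_name[a["category_id"]], a["bbox"])
        for a in coco["annotations"]
        if a["image_id"] in image_ids
    ]


# Hypothetical in-memory sample in COCO layout; names and values are made up.
sample = {
    "images": [{"id": 1, "file_name": "circuit_0001.png"}],
    "categories": [{"id": 1, "name": "resistor"}, {"id": 2, "name": "capacitor"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [120, 80, 60, 24]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [300, 82, 58, 22]},
        {"id": 12, "image_id": 1, "category_id": 2, "bbox": [210, 150, 30, 40]},
    ],
}

# With the real data, the dict would instead come from the annotation file, e.g.
#   with open("path/to/annotations.json") as f:   # hypothetical path
#       coco = json.load(f)
for name, n in sorted(count_symbols_per_class(sample).items()):
    print(name, n)
```

Counting boxes per class in this way is a quick check that all 17 symbol classes are present after download and that the class balance suits the intended detector.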

Institutions

Islamic University of Technology

Categories

Computer Vision, Electronic Circuit, Object Detection, Deep Learning
