MHCD_GIETV2

Name: MHCD_GIETV2
Creator: Shazid Wahid Khandakhani
Published: 2024-07-30T15:40:23.747Z
Keywords: Image Capture, Character Extraction, Image Database, Image Classification

Khandakhani, Shazid Wahid; DASH, SACHIKANTA; Padhy, Sasmita; Panda, Rabinarayan

doi:10.17632/3cskdzypxm.1

Published: 30 July 2024 | Version 1 | DOI: 10.17632/3cskdzypxm.1

Contributors:

Shazid Wahid Khandakhani, SACHIKANTA DASH, Sasmita Padhy, Rabinarayan Panda

Description

This dataset of handwritten Marathi consonant characters was created for research purposes through data collection, annotation, bounding-box extraction, thresholding, augmentation, and resizing of all images to an equal size. The Marathi Handwritten Character (Consonant) dataset consists of annotated data comprising 80,040 simple character image examples. The original scanned handwritten characters were resized and normalized to fit in a 128x128 pixel box while preserving their aspect ratio. Four folders are available:
• Original Images: 20,010 images
• Grayscale Images: 20,010 images
• Binary Images: 20,010 images
• Inverted Images: 20,010 images
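The resizing described above can be illustrated with a small geometry sketch. It only computes the target size and offsets for fitting an image into a 128x128 box with the aspect ratio preserved. The function name `fit_in_box` is hypothetical, and centring the scaled image inside the box is an assumption: the dataset description does not state how the remaining space was filled or aligned.

```python
def fit_in_box(width, height, box=128):
    """Return (new_w, new_h, left, top) for scaling an image into a square box.

    The longer side is scaled to `box` pixels and the shorter side by the
    same factor, so the aspect ratio is preserved. `left` and `top` are the
    offsets that would centre the scaled image (centring is an assumption).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    # Multiply before dividing to keep the arithmetic as exact as possible.
    new_w = max(1, round(width * box / longest))
    new_h = max(1, round(height * box / longest))
    left = (box - new_w) // 2
    top = (box - new_h) // 2
    return new_w, new_h, left, top


# A 300x150 crop becomes 128x64, placed 32 pixels from the top of the box.
print(fit_in_box(300, 150))  # (128, 64, 0, 32)
```

In practice the returned size would be passed to an image library's resize routine and the result pasted onto a blank 128x128 canvas at `(left, top)`.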

Steps to reproduce

The data were collected from people of different age groups, including students from primary schools, high schools, and colleges. Each student was given an A4 sheet of paper on which to write the consonant characters in their own handwriting, so that handwritten characters can later be predicted. The images were then annotated and classes were defined using online annotation tools. The classes and their respective images were then processed in MATLAB to extract the annotated images into separate folders. Pre-processing techniques were then applied to resize every image to exactly 128 by 128 pixels for further evaluation. Next, depending on the size of the data, additional augmented images were created, keeping the aspect ratio in mind, for training our convolutional neural network and for prediction. Finally, binarized, grayscale, and inverted versions of the images were produced for future research work. This dataset will help research scholars and data scientists who want to take a deep dive into Marathi character recognition.
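The last step, deriving grayscale, binary, and inverted variants, might look roughly like the sketch below. It is not the authors' MATLAB pipeline. The function names are hypothetical, the luma weights (0.299, 0.587, 0.114) and the fixed threshold of 128 are assumptions, and whether the published inverted images were produced from the binary or the grayscale images is not stated. Images are represented as plain nested lists so the example has no dependencies.

```python
def to_grayscale(rgb):
    """Convert rows of (R, G, B) tuples to rows of 0-255 gray levels.

    Uses the common BT.601 luma weights (an assumption, not from the source).
    """
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb]


def to_binary(gray, threshold=128):
    """Map gray levels to 0 or 255 using a fixed threshold (assumed value)."""
    return [[255 if value >= threshold else 0 for value in row] for row in gray]


def invert(image):
    """Swap dark and light pixels in a 0-255 image."""
    return [[255 - value for value in row] for row in image]


# Tiny 2x2 example: light paper pixels on the left, dark ink on the right.
sample = [
    [(255, 255, 255), (0, 0, 0)],
    [(200, 200, 200), (30, 30, 30)],
]
gray = to_grayscale(sample)
binary = to_binary(gray)
inverted = invert(binary)
print(gray)      # [[255, 0], [200, 30]]
print(binary)    # [[255, 0], [255, 0]]
print(inverted)  # [[0, 255], [0, 255]]
```

Inversion turns dark ink on light paper into light strokes on a dark background, a layout that many character-recognition models expect.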

Institutions

Gandhi Institute of Engineering and Technology
