MHCD_GIETV2
Description
This dataset of handwritten Marathi consonant characters was created for research purposes through data collection, annotation, bounding-box extraction, thresholding, augmentation, and resizing of the images to a uniform size. The Marathi Handwritten Character (Consonant) dataset consists of annotated data comprising 80,040 simple character images. The original scanned images of handwritten characters were resized and normalized to fit in a 128x128 pixel box while preserving their aspect ratio. Four folders are available:
• Original Images: 20,010 images
• Grayscale Images: 20,010 images
• Binary Images: 20,010 images
• Inverted Images: 20,010 images
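The aspect-preserving fit into a 128x128 box mentioned in the description can be illustrated with a small, dependency-free Python sketch. It is not the authors' code (the steps to reproduce mention MATLAB): the function name `fit_to_box`, the nearest-neighbour sampling, the centering, and the white padding value are all assumptions made for illustration. In practice a library such as OpenCV or Pillow would normally be used.

```python
def fit_to_box(img, box=128, background=255):
    """Fit a grayscale image into a box x box canvas.

    `img` is a list of rows of 0-255 integers. The aspect ratio is kept,
    and the unused border is filled with `background` (white here, which
    is an assumption and not stated by the dataset authors).
    """
    src_h, src_w = len(img), len(img[0])

    # One scale factor for both axes keeps the aspect ratio unchanged.
    scale = min(box / src_w, box / src_h)
    new_w = max(1, min(box, round(src_w * scale)))
    new_h = max(1, min(box, round(src_h * scale)))

    # Nearest-neighbour sampling (assumed; the original method is not stated).
    resized = [
        [img[min(src_h - 1, int(y / scale))][min(src_w - 1, int(x / scale))]
         for x in range(new_w)]
        for y in range(new_h)
    ]

    # Center the resized character on the padded canvas.
    top = (box - new_h) // 2
    left = (box - new_w) // 2
    canvas = [[background] * box for _ in range(box)]
    for y, row in enumerate(resized):
        canvas[top + y][left:left + new_w] = row
    return canvas


# Example: a 200x100 (width x height) all-black crop becomes a 128x64
# black band centered on a white 128x128 canvas.
crop = [[0] * 200 for _ in range(100)]
out = fit_to_box(crop)
```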
Steps to reproduce
The data was collected from people of different age groups, including students from primary schools, high schools, and colleges. Each participant was given an A4 sheet of paper on which to write the consonant characters in their own handwriting, so that handwritten characters from a range of writers can be predicted. The images were then annotated, and classes were defined, using online annotation tools. The classes and their respective images were then used to extract the annotated character images into separate folders using MATLAB. Pre-processing techniques were then applied to resize every image to exactly 128 by 128 pixels for further evaluation. Next, depending on the size of the data, additional augmented images were created, keeping the aspect ratio in mind, for training and prediction with our convolutional neural network. Finally, binarization, grayscale conversion, and image inversion were applied for future research work. This dataset will help research scholars and data scientists who want to take a deep dive into Marathi character recognition.
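The final grayscale, binary, and inverted variants can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, the grayscale weights are the common ITU-R BT.601 luma coefficients, the fixed threshold of 128 is a placeholder (the threshold actually used is not stated), and applying the inversion to the binary image is likewise a guess.

```python
def to_grayscale(rgb):
    """Convert rows of (R, G, B) tuples to rows of 0-255 gray levels.

    Uses ITU-R BT.601 luma weights (an assumed choice).
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb
    ]


def binarize(gray, threshold=128):
    """Map each pixel to 255 if it is >= threshold, else 0.

    The threshold value is a placeholder, not taken from the dataset.
    """
    return [[255 if p >= threshold else 0 for p in row] for row in gray]


def invert(img):
    """Swap dark and light: each pixel p becomes 255 - p."""
    return [[255 - p for p in row] for row in img]


# Tiny 2x2 example: white, black, pure red, pure blue.
original = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]
gray = to_grayscale(original)
binary = binarize(gray)
inverted = invert(binary)  # assumption: inversion applied to the binary image
```

Keeping the three steps as separate functions mirrors the four-folder layout of the dataset, where each variant is stored alongside the original image.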