A Benchmark Dataset for Manipuri Meetei-Mayek Handwritten Character Recognition

Published: 26 September 2019 | Version 3 | DOI: 10.17632/3337bdvx3v.3
Contributor: Pangambam Singh

Description

Any machine-learning-based classification or recognition system needs a benchmark dataset. To the best of our knowledge, no benchmark dataset for handwritten character recognition of the Manipuri Meetei-Mayek script has so far been available in the public domain. In this work, we introduce a handwritten Manipuri Meetei-Mayek character dataset of more than 5000 samples, collected during March and April 2019 from a diverse population group in three districts of Manipur, India (Imphal East District, Thoubal District and Kamjong District). Each individual was asked to write all the Manipuri characters on one A4-size sheet of paper. The responses were scanned, and each character was then manually segmented from the scanned image. The dataset is divided into five categories: (1) Mapi Mayek, (2) Lonsum Mayek, (3) Cheitap Mayek, (4) Cheising Mayek and (5) Khutam Mayek. The segmented character images are provided in both .JPG and .MAT formats.
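As a starting point for using the .JPG files, the following minimal sketch builds a list of (image path, label) pairs from the five categories named above. It assumes a hypothetical layout in which each category has its own folder named after it; the actual archive structure is not described here and may differ, so the folder names and the `index_dataset` helper are illustrative only.

```python
from pathlib import Path

# The five categories named in the description. Using them as folder
# names is an ASSUMPTION; adjust to match the downloaded archive.
CATEGORIES = [
    "Mapi Mayek",
    "Lonsum Mayek",
    "Cheitap Mayek",
    "Cheising Mayek",
    "Khutam Mayek",
]


def index_dataset(root):
    """Return (image_path, label_index) pairs for every JPEG found
    under a category folder of `root` (hypothetical layout)."""
    root = Path(root)
    samples = []
    for label, name in enumerate(CATEGORIES):
        folder = root / name
        if not folder.is_dir():
            continue  # tolerate categories that are missing on disk
        # Search recursively, since per-character subfolders may exist.
        for path in sorted(folder.rglob("*")):
            if path.is_file() and path.suffix.lower() in {".jpg", ".jpeg"}:
                samples.append((path, label))
    return samples
```

Calling `index_dataset("path/to/dataset")` would give a list that can be shuffled and split into training and test sets. The .MAT files can presumably be read with `scipy.io.loadmat`, although the variable names stored inside them are not documented on this page.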

Institutions

Banaras Hindu University

Categories

Optical Character Recognition, Natural Language Processing, Machine Learning, Pattern Recognition