Igbo Alphabets

Published: 18 July 2022 | Version 1 | DOI: 10.17632/jccjsk6pd3.1
Lennox Charles


This dataset contains handwritten samples of every letter of the Igbo alphabet, from the first letter to the last, and can be used for character recognition. The handwriting was recorded physically, and a binarized version is included. The handwriting of 50 students was captured for both uppercase and lowercase letters, giving 3600 images in total (72 per student). The data was then duplicated to give 7200 images.

The dataset file: This file contains the raw images of the dataset, which is why it is the largest file.

The binary file: This file contains the raw data converted into binary format with a threshold of 210, which is why it is the smallest file.

The sorted file: This file contains the 7200 images sorted into folders, i.e., one folder was created for all the 'A' images and so on up to 'Z'. This folder structure is what distinguishes it from the binary file.

To use the dataset, download the file you need and unzip it.
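The thresholding used for the binary file can be illustrated with a minimal sketch. It is not the script that produced the dataset: the function name `binarize`, the use of nested lists of 0-255 grayscale values, and the polarity (values above 210 become white, everything else black) are assumptions made for illustration.

```python
def binarize(pixels, threshold=210):
    """Map each grayscale value (0-255) to pure black (0) or pure white (255).

    Assumption: values strictly above the threshold become white (paper),
    and all other values become black (ink).
    """
    return [
        [255 if value > threshold else 0 for value in row]
        for row in pixels
    ]


# A tiny 2 x 3 "image": light paper on the top row, darker ink below.
sample = [
    [255, 230, 211],
    [210, 120, 0],
]

print(binarize(sample))  # [[255, 255, 255], [0, 0, 0]]
```

A threshold as high as 210 treats only very light pixels as background, which suits dark ink on white paper. An image library would normally load the pixel values from the image files; that step is left out here.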


Steps to reproduce

I created an 8 x 9 table of equal squares (72 cells), and each letter was written in its own box. The handwriting of 50 students was recorded in this way. After that, I cropped each square using CorelDRAW to speed up the process. If CorelDRAW is not available, the snipping tool on your computer can be used instead. Labelling each image was crucial to the next step: I wrote an algorithm that sorts the images by label and groups them together. After the grouping, I wrote another algorithm that binarizes all the images in each folder.
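The sorting step can be sketched as follows. This is an illustrative reconstruction, not the original algorithm: the function name `sort_by_label`, the `.png` extension, and the filename convention (the label is the part of the name before the first underscore, e.g. `A_01.png`) are all hypothetical.

```python
import shutil
from pathlib import Path


def sort_by_label(src_dir, dst_dir, pattern="*.png"):
    """Copy each image into a folder named after its label.

    Assumption: the label is the text before the first underscore in the
    file name, so "A_01.png" is copied to "<dst_dir>/A/A_01.png".
    Returns a dict mapping each label to the number of images copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    counts = {}
    for image in sorted(src.glob(pattern)):
        label = image.stem.split("_", 1)[0]
        folder = dst / label
        folder.mkdir(parents=True, exist_ok=True)  # one folder per label
        shutil.copy2(image, folder / image.name)   # keep the original file
        counts[label] = counts.get(label, 0) + 1
    return counts
```

Calling `sort_by_label("cropped", "sorted")` on a folder of labelled crops would produce one sub-folder per label. On case-insensitive file systems, uppercase and lowercase labels such as `A` and `a` would share a folder, so distinct label names for the two cases may be needed.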


Kwara State University


Optical Character Recognition