BdSL47: A Complete Depth-based Bangla Sign Alphabet and Digit Dataset
Description
BdSL47 is an original dataset constructed under the supervision of the Systems and Software Lab (SSL), Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), and the Department of Computer Science and Engineering (CSE), United International University (UIU), Dhaka, Bangladesh. The dataset belongs to the authors, who administered the collection process and obtained informed consent from the participants (signers) of the Bangla sign alphabets and digits.

The signs were collected from 10 signers in a controlled setting. The signers vary in age, gender, hand shape, and skin color, and the recordings deliberately include variation in scaling, translation, hand rotation, hand orientation, lighting, and background. We captured webcam images of 47 signs (10 Bangla sign digits and 37 Bangla sign alphabet letters) and resized them to 640x480, giving 47,000 RGB images in total.

Each RGB image was processed with the MediaPipe framework, which detects 21 predefined hand key points and provides normalized x and y coordinates plus an estimated depth value for each key point. These 21×3 = 63 values form the input features of a sample. The 3D coordinates are stored in .csv files, each holding 100 image samples of the same sign, for 470 .csv files in total.

The dataset has two top-level folders, “Bangla Sign Language Dataset - Sign Alphabets” and “Bangla Sign Language Dataset - Sign Digits”. Each contains user-wise folders holding the sign images (raw input images in JPG format) and the CSV files (normalized 3D coordinates of the 21 hand key points with the corresponding class labels). All files, around 1.12 GB, can be downloaded directly with the ‘Download All’ option at doi: 10.17632/pbb3w3f92y.3. The images and CSV files can be read and processed with Python (version 3.10.3) or any other programming language.

If you use this dataset in your research, please cite it in the following format: “Rayeed, S M; Tuba, Sidratul Tamzida; Mahmud, Hasan; Mazumder, Mumtahin Habib Ullah; Hossain, Md. Saddam; Hasan, Md. Kamrul (2023), “BdSL47: A Complete Depth-based Bangla Sign Alphabet and Digit Dataset”, Mendeley Data, V3, doi: 10.17632/pbb3w3f92y.3”. The files associated with this dataset are licensed under a Creative Commons Attribution 4.0 International license. The accompanying code is available at https://github.com/SMRayeed/BdSL47-Recognition.
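As a rough illustration of how one of the CSV files might be read, the sketch below parses rows into 21 (x, y, depth) triples plus a class label using only the Python standard library. The column layout is an assumption, not something stated above: it assumes a header row, then 63 feature columns ordered as x, y, depth per key point, followed by one label column. The function name parse_rows is hypothetical. Check the actual files and adjust accordingly. Note that the counts in the description are consistent: 470 files with 100 samples each gives 47,000 samples.

```python
import csv
import io

NUM_KEYPOINTS = 21          # hand key points per sample (from the description)
COORDS_PER_KEYPOINT = 3     # x, y, estimated depth
NUM_FEATURES = NUM_KEYPOINTS * COORDS_PER_KEYPOINT  # 63


def parse_rows(csv_file, has_header=True):
    """Yield (keypoints, label) for each data row of an open CSV file.

    ASSUMED layout (not confirmed by the dataset description):
    63 feature columns ordered x, y, depth per key point, then the label.
    """
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # skip the assumed header row
    for row in reader:
        if not row:
            continue  # ignore blank lines
        if len(row) < NUM_FEATURES + 1:
            raise ValueError(f"expected at least {NUM_FEATURES + 1} columns, got {len(row)}")
        values = [float(v) for v in row[:NUM_FEATURES]]
        # Group the flat list into 21 (x, y, depth) triples.
        keypoints = [
            tuple(values[i:i + COORDS_PER_KEYPOINT])
            for i in range(0, NUM_FEATURES, COORDS_PER_KEYPOINT)
        ]
        label = row[NUM_FEATURES]
        yield keypoints, label


# Synthetic two-row file standing in for a real dataset CSV.
header = [f"f{i}" for i in range(NUM_FEATURES)] + ["label"]
row_a = [str(float(i)) for i in range(NUM_FEATURES)] + ["5"]
row_b = [str(0.5)] * NUM_FEATURES + ["5"]
demo_csv = "\n".join(",".join(r) for r in (header, row_a, row_b))

samples = list(parse_rows(io.StringIO(demo_csv)))
```

To read a real file, open it with open(path, newline="") and pass the file object to parse_rows; set has_header=False if the files turn out to have no header row.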
Steps to reproduce
The code for dataset processing and classification is available on GitHub: https://github.com/SMRayeed/BdSL47-Recognition
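For orientation only, here is a minimal sketch of how 63 key point features could be extracted from a single image with MediaPipe's legacy Hands solution and OpenCV. This is not the authors' code (see the repository above for that). The function names, the single-hand setting, and the x, y, z ordering of the output are assumptions made for illustration, and the mediapipe and opencv-python packages must be installed for extract_features to run.

```python
def flatten_landmarks(landmarks):
    """Flatten landmark objects exposing .x, .y, .z into one flat list.

    For 21 hand landmarks this yields 63 values, ordered x, y, z per
    key point (this ordering is an assumption for illustration).
    """
    features = []
    for lm in landmarks:
        features.extend([lm.x, lm.y, lm.z])
    return features


def extract_features(image_path):
    """Return 63 features for the first detected hand, or None if no hand is found."""
    # Imported here so flatten_landmarks stays usable without these packages.
    import cv2
    import mediapipe as mp

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    # OpenCV loads images as BGR; MediaPipe expects RGB.
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # static_image_mode=True treats every input as an independent image.
    with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
        results = hands.process(rgb)

    if not results.multi_hand_landmarks:
        return None
    return flatten_landmarks(results.multi_hand_landmarks[0].landmark)
```

Calling extract_features on a JPG from one of the user-wise folders would return a 63-element list when a hand is detected. Whether the values match the dataset's CSV files exactly depends on the MediaPipe version and settings the authors used, which are not stated here.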
Funding
Institute for Advanced Research (IAR), United International University (UIU), Dhaka, Bangladesh.
UIU-IAR-01-2022-SE-37