HANDS: a dataset of static Hand-Gestures for Human-Robot Interaction
Description
The HANDS dataset has been created for human-robot interaction research and is composed of spatially and temporally aligned RGB and depth frames. It contains 12 static single-hand gestures, performed with both the right hand and the left hand, and 3 static two-hands gestures, for a total of 29 unique classes. Five subjects (2 females and 3 males) performed the gestures, each with a different background and lighting conditions. For each subject, 150 RGB frames and the 150 corresponding depth frames were collected per gesture, for a total of 2400 RGB frames and 2400 depth frames per subject.

Data was collected with a Kinect v2 camera, intrinsically calibrated so that the RGB data is spatially aligned to the depth data. The temporal alignment was performed offline in MATLAB, pairing frames with a maximum temporal distance of 66 ms.

We provide our MATLAB scripts to process similar rosbags and align the streams, to build a MATLAB Labeling Session, and to create the same Annotation files we provide. For users who want to use the annotated data for research, we also provide a Python script showing how to convert the Annotation files into a TensorFlow record (TFRecord) file.

The data is valuable for the field of Computer Vision, especially for the tasks of hand-gesture recognition, human-machine interaction, and hand-pose recognition. The dataset can be used to train Deep Learning models to recognize the gestures using a single modality (RGB or depth) or both at the same time. It is also useful as a reference dataset for benchmarking models.

If you use this dataset in your work, please cite the related papers:
- https://doi.org/10.1016/j.dib.2021.106791
- https://doi.org/10.1016/j.rcim.2020.102085
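
For readers who want a feel for the offline temporal alignment described above, the following is a minimal Python sketch (not the MATLAB script shipped with the dataset) that pairs each RGB frame with the nearest depth frame and discards pairs more than 66 ms apart. The timestamp lists and their format (seconds, sorted in ascending order) are assumptions for illustration only.

```python
# Hypothetical sketch of the temporal alignment step: pair every RGB timestamp
# with the closest depth timestamp and keep pairs at most 66 ms apart.
# Both lists are assumed to be non-empty, sorted, and expressed in seconds.
def align_streams(rgb_stamps, depth_stamps, max_dt=0.066):
    pairs = []
    j = 0
    for i, t_rgb in enumerate(rgb_stamps):
        # Advance the depth index while the next depth frame is at least as close in time.
        while j + 1 < len(depth_stamps) and abs(depth_stamps[j + 1] - t_rgb) <= abs(depth_stamps[j] - t_rgb):
            j += 1
        if abs(depth_stamps[j] - t_rgb) <= max_dt:
            pairs.append((i, j))  # indices of an aligned RGB/depth pair
    return pairs
```

Similarly, the sketch below shows one possible way to serialize annotated frames into a TFRecord file, in the spirit of the Python conversion script mentioned above. The CSV layout (columns `image_path` and `label`) and the file names are assumptions; adapt the parsing to the actual Annotation files.

```python
# Hypothetical sketch: write (image, label) pairs listed in a CSV annotation file
# into a TFRecord file that TensorFlow input pipelines can consume.
import csv
import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def write_tfrecord(csv_path, tfrecord_path):
    with open(csv_path, newline="") as f, tf.io.TFRecordWriter(tfrecord_path) as writer:
        for row in csv.DictReader(f):
            image_bytes = tf.io.read_file(row["image_path"]).numpy()  # raw encoded image
            example = tf.train.Example(features=tf.train.Features(feature={
                "image/encoded": _bytes_feature(image_bytes),
                "image/class/label": _int64_feature(int(row["label"])),
            }))
            writer.write(example.SerializeToString())

write_tfrecord("annotations.csv", "hands_train.tfrecord")  # example file names
```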