KARD - Kinect Activity Recognition Dataset

Published: 18 October 2022 | Version 2 | DOI: 10.17632/k28dtm7tr6.2
Contributors:
Marco Morana

Description

To cite this dataset, please refer to the following paper: Human Activity Recognition Process Using 3-D Posture Data. S. Gaglio, G. Lo Re, M. Morana. In IEEE Transactions on Human-Machine Systems. 2014 doi: 10.1109/THMS.2014.2377111

KARD contains 18 activities. Each activity is performed 3 times by 10 different subjects.

1 Horizontal arm wave
2 High arm wave
3 Two hand wave
4 Catch Cap
5 High throw
6 Draw X
7 Draw Tick
8 Toss Paper
9 Forward Kick
10 Side Kick
11 Take Umbrella
12 Bend
13 Hand Clap
14 Walk
15 Phone Call
16 Drink
17 Sit down
18 Stand up

In total, there are 4 (files) x 18 (activities) x 3 (repetitions) x 10 (subjects) = 2160 files.

Each filename has the form aA_sS_eN_string, where A is a two-digit action ID, S is a two-digit subject ID, and N is the repetition number. The string part depends on the type of information provided:

- depthmaps.txt: depth map,
- .mp4: 640x480 RGB video,
- realworld.txt: joint positions in real-world coordinates,
- screen.txt: joint positions in screen coordinates and depth value.

For example, the file a04_s03_e02_realworld.txt contains the skeleton joint positions in real-world coordinates for the second repetition of action #4 performed by subject #3.

The files containing the skeleton coordinates (realworld.txt and screen.txt) list the 15 joints in consecutive blocks, one block per frame, in the following order:

line 1 Head
line 2 Neck
line 3 Right Shoulder
line 4 Right Elbow
line 5 Right Hand
line 6 Left Shoulder
line 7 Left Elbow
line 8 Left Hand
line 9 Torso
line 10 Right Hip
line 11 Right Knee
line 12 Right Foot
line 13 Left Hip
line 14 Left Knee
line 15 Left Foot

Each file contains 15xF lines, where F is the number of frames in that sequence. Each line reports three numbers: real-world coordinates (x, y, z) for realworld.txt, or screen coordinates and depth value (u, v, depth) for screen.txt.
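The naming scheme and the 15-lines-per-frame layout described above can be read with a few lines of Python. The following is only a minimal sketch, not code shipped with the dataset: the names parse_name, parse_skeleton and SequenceId are hypothetical, and the field separator inside the .txt files is not stated in the description, so the sketch assumes the three numbers on each line are separated by whitespace and/or commas.

```python
import re
from typing import List, NamedTuple, Tuple

# Joint order as listed in the description (line 1 .. line 15 of each frame block).
JOINTS = [
    "Head", "Neck",
    "Right Shoulder", "Right Elbow", "Right Hand",
    "Left Shoulder", "Left Elbow", "Left Hand",
    "Torso",
    "Right Hip", "Right Knee", "Right Foot",
    "Left Hip", "Left Knee", "Left Foot",
]

# aA_sS_eN followed by a type-dependent remainder, which is kept verbatim.
_NAME_RE = re.compile(r"^a(\d{2})_s(\d{2})_e(\d+)(.*)$")

Triple = Tuple[float, float, float]


class SequenceId(NamedTuple):
    """Hypothetical container for the fields encoded in a filename."""
    action: int
    subject: int
    repetition: int
    suffix: str


def parse_name(filename: str) -> SequenceId:
    """Split a filename such as a04_s03_e02_realworld.txt into its fields."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not a KARD-style filename: {filename!r}")
    action, subject, repetition, suffix = match.groups()
    return SequenceId(int(action), int(subject), int(repetition), suffix)


def parse_skeleton(text: str) -> List[List[Triple]]:
    """Group the lines of a realworld.txt or screen.txt file into frames.

    Assumption: each non-empty line holds three numbers separated by
    whitespace and/or commas. Returns one list of 15 triples per frame.
    """
    rows: List[Triple] = []
    for line in text.splitlines():
        fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"expected 3 values per line, got {len(fields)}")
        x, y, z = (float(f) for f in fields)
        rows.append((x, y, z))
    n = len(JOINTS)
    if len(rows) % n != 0:
        raise ValueError(f"{len(rows)} lines is not a multiple of {n} joints")
    return [rows[i:i + n] for i in range(0, len(rows), n)]
```

With a file read into a string, frames = parse_skeleton(text) gives F frames, and frames[0][JOINTS.index("Torso")] is the torso triple of the first frame. Whether the triple means (x, y, z) or (u, v, depth) depends on which of the two file types was loaded.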
The dataset comprises 540 sequences (18 activities x 3 repetitions x 10 subjects), about 1 hour of video in total, captured at a resolution of 640x480 pixels at 30 fps. Uncompressed frame images are also available on request.
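The counts quoted in the description can be checked with simple arithmetic, and the 30 fps frame rate converts a skeleton file's line count into a duration. The sketch below is illustrative only: expected_prefixes and duration_seconds are hypothetical helper names, and it assumes that action, subject and repetition IDs all start at 1 and are written with two digits, as the example a04_s03_e02 suggests but the description does not state explicitly.

```python
from typing import List

N_ACTIONS = 18        # activities
N_SUBJECTS = 10       # subjects
N_REPETITIONS = 3     # repetitions per activity and subject
FILES_PER_SEQUENCE = 4  # depth maps, RGB video, real-world joints, screen joints
N_JOINTS = 15
FPS = 30


def expected_prefixes() -> List[str]:
    """List every aA_sS_eN prefix, assuming 1-based two-digit IDs (an assumption)."""
    return [
        f"a{a:02d}_s{s:02d}_e{e:02d}"
        for a in range(1, N_ACTIONS + 1)
        for s in range(1, N_SUBJECTS + 1)
        for e in range(1, N_REPETITIONS + 1)
    ]


def duration_seconds(num_lines: int) -> float:
    """Duration of a sequence from the line count of its skeleton file.

    A skeleton file has 15 lines per frame, and frames are captured at 30 fps.
    """
    frames, remainder = divmod(num_lines, N_JOINTS)
    if remainder:
        raise ValueError(f"{num_lines} lines is not a multiple of {N_JOINTS}")
    return frames / FPS


n_sequences = len(expected_prefixes())
n_files = n_sequences * FILES_PER_SEQUENCE
```

Here n_sequences is 540 and n_files is 2160, matching the totals given in the description. Since the videos total about 1 hour, the average sequence lasts roughly 3600 / 540, or about 6.7 seconds, which is around 200 frames at 30 fps; individual sequences will vary.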

Steps to reproduce

If you use this dataset, please cite the following paper: Human Activity Recognition Process Using 3-D Posture Data. S. Gaglio, G. Lo Re, M. Morana. In IEEE Transactions on Human-Machine Systems. 2014 doi: 10.1109/THMS.2014.2377111

Categories

Computer Science, Artificial Intelligence, Activity Recognition, 3D Imaging, Ambient Intelligence

Licence